Methods for identifying small molecules that modulate premature translation termination and nonsense-mediated mRNA decay

ABSTRACT

The present invention relates to a method for screening and identifying compounds that modulate premature translation termination and/or nonsense-mediated messenger ribonucleic acid (“mRNA”) decay by interacting with a preselected target ribonucleic acid (“RNA”). In particular, the present invention relates to identifying compounds that bind to regions of the 28S ribosomal RNA (“rRNA”) and analogs thereof. Direct, noncompetitive binding assays are advantageously used to screen libraries of compounds for those that selectively bind to a preselected target RNA. Binding of target RNA molecules to a particular compound is detected using any physical method that measures the altered physical property of the target RNA bound to a compound. The structure of the compound attached to the labeled RNA is also determined. The methods used will depend, in part, on the nature of the library screened. The methods of the present invention provide a simple, sensitive assay for high-throughput screening of libraries of compounds to identify pharmaceutical leads.

This application is entitled to and claims priority benefit to U.S. Provisional Patent Application No. 60/398,344, filed Jul. 24, 2002, and U.S. Provisional Patent Application No. 60/398,332, filed Jul. 24, 2002, both of which are incorporated herein by reference in their entirety.

1. INTRODUCTION

The present invention relates to a method for screening and identifying compounds that modulate premature translation termination and/or nonsense-mediated messenger ribonucleic acid (“mRNA”) decay by interacting with a preselected target ribonucleic acid (“RNA”). In particular, the present invention relates to methods of identifying compounds that bind to regions of the 28S ribosomal RNA (“rRNA”) and analogs thereof. Direct, non-competitive binding assays are advantageously used to screen libraries of compounds for those that selectively bind to a preselected target RNA. Binding of target RNA molecules to a particular compound is detected using any physical method that measures the altered physical property of the target RNA bound to a compound. The methods of the present invention provide a simple, sensitive assay for high-throughput screening of libraries of compounds to identify pharmaceutical leads.

2. BACKGROUND OF THE INVENTION

Protein synthesis encompasses the processes of translation initiation, elongation, and termination, each of which has evolved to occur with great accuracy and has the capacity to be a regulated step in the pathway of gene expression. Recent studies, including those suggesting that events at termination may regulate the ability of ribosomes to recycle to the start site of the same mRNA, have underscored the potential of termination to regulate other aspects of translation. The RNA triplets UAA, UAG, and UGA are non-coding and promote translational termination. Termination starts when one of the three termination codons enters the A site of the ribosome, thereby signaling the polypeptide chain release factors to bind and recognize the termination signal. Subsequently, the ester bond between the 3′ nucleotide of the transfer RNA (“tRNA”) located in the ribosome's P site and the nascent polypeptide chain is hydrolyzed, the completed polypeptide chain is released, and the ribosome subunits are recycled for another round of translation.

Nonsense-mediated mRNA decay is a surveillance mechanism that minimizes the translation and regulates the stability of RNAs that contain chain termination nonsense mutations (see, e.g., Hentze & Kulozik, 1999, Cell 96:307-310; Culbertson, 1999, Trends in Genetics 15:74-80; Li & Wilkinson, 1998, Immunity 8:135-141; and Ruiz-Echevarria et al., 1996, Trends in Biochemical Sciences, 21:433-438). Chain termination nonsense mutations are caused when a base substitution or frameshift mutation changes a codon into a termination codon, i.e., a premature stop codon that causes translational termination. In nonsense-mediated mRNA decay, mRNAs with premature stop codons are frequently subjected to degradation. A truncated protein is produced as a result of the translation apparatus prematurely terminating at the stop codon.
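As an illustration of how a base substitution can create a premature stop codon, the following minimal Python sketch scans a reading frame for the first in-frame termination codon (UAA, UAG, or UGA). The function names and the toy sequences are hypothetical and are not taken from this application; the sketch assumes the reading frame begins at the given start index.

```python
# Illustrative sketch only; names and sequences are hypothetical.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_in_frame_stop(mrna, start=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def has_premature_stop(mrna, normal_stop_index, start=0):
    """True if a stop codon occurs in frame before the normal stop codon."""
    idx = first_in_frame_stop(mrna, start)
    return idx is not None and idx < normal_stop_index


wild_type = "AUGGCUCAGUAA"  # AUG GCU CAG UAA
mutant = "AUGGCUUAGUAA"     # CAG -> UAG substitution (nonsense mutation)
print(first_in_frame_stop(wild_type), first_in_frame_stop(mutant))
```

In this toy example the single C-to-U substitution moves the first in-frame stop codon upstream of the normal termination codon, which is the situation described above as a chain termination nonsense mutation.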

Nonsense mutations cause approximately 10 to 30 percent of the individual cases of virtually all inherited diseases. Although nonsense mutations inhibit the synthesis of a full-length protein to one percent or less of wild-type levels, minimally boosting the expression levels of the full-length protein to between five and fifteen percent of normal levels can eliminate or greatly reduce the severity of disease. Nonsense suppression causes the read-through of a termination codon and the generation of full-length protein. Certain aminoglycosides have been found to promote nonsense suppression (see, e.g., Bedwell et al., 1997, Nat. Med. 3:1280-1284 and Howard et al., 1996, Nat. Med. 2:467-469). Clinical approaches that target the translation termination event to promote nonsense suppression have recently been described for model systems of cystic fibrosis and muscular dystrophy; gentamicin is an aminoglycoside antibiotic that causes translational misreading and allows the insertion of an amino acid at the site of the nonsense codon in models of cystic fibrosis, Hurler's syndrome, and muscular dystrophy (see, e.g., Barton-Davis et al., 1999, J. Clin. Invest. 104:375-381). These results strongly suggest that drugs that promote nonsense suppression by altering translation termination efficiency of a premature termination codon can be therapeutically valuable in the treatment of diseases caused by nonsense mutations.

Certain classes of known antibiotics have been characterized and found to interact with RNA. For example, the antibiotic thiostrepton binds tightly to a 60-mer from ribosomal RNA (Cundliffe et al., 1990, in The Ribosome: Structure, Function & Evolution (Schlessinger et al., eds.) American Society for Microbiology, Washington, D.C. pp. 479-490), and bacterial resistance to various antibiotics often involves methylation at specific rRNA sites (Cundliffe, 1989, Ann. Rev. Microbiol. 43:207-233). In addition, certain aminoglycosides and other protein synthesis inhibitors have been found to interact with specific bases in 16S rRNA (Woodcock et al., 1991, EMBO J. 10:3099-3103); moreover, an oligonucleotide analog of the 16S rRNA has been shown to interact with certain aminoglycosides (Purohit et al., 1994, Nature 370:659-662). Aminoglycosidic aminocyclitol (aminoglycoside) antibiotics and peptide antibiotics are known to inhibit group I intron splicing by binding to specific regions of the RNA (von Ahsen et al., 1991, Nature (London) 353:368-370). Some of these same aminoglycosides have also been found to inhibit hammerhead ribozyme function (Stage et al., 1995, RNA 1:95-101). A molecular basis for hypersensitivity to aminoglycosides has been found to be located in a single base change in mitochondrial rRNA (Hutchin et al., 1993, Nucleic Acids Res. 21:4174-4179). Aminoglycosides have also been shown to inhibit the interaction between specific structural RNA motifs and the corresponding RNA binding protein. Zapp et al. (Cell, 1993, 74:969-978) have demonstrated that the aminoglycosides neomycin B, lividomycin A, and tobramycin can block the binding of Rev, a viral regulatory protein required for viral gene expression, to its viral recognition element in the IIB (or RRE) region of HIV RNA. This blockage appears to be the result of competitive binding of the antibiotics directly to the RRE RNA structural motif.

Citation or identification of any reference in Section 2 of this application is not an admission that such reference is available as prior art to the present invention.

3. SUMMARY OF THE INVENTION

The present invention provides methods for identifying compounds that modulate translation termination and/or nonsense-mediated mRNA decay by identifying compounds that bind to preselected target elements of nucleic acids including, but not limited to, specific RNA sequences, RNA structural motifs, and/or RNA structural elements. In particular, the present invention provides methods of identifying compounds that bind to regions of the 28S rRNA and analogs thereof. The specific target RNA sequences, RNA structural motifs, and/or RNA structural elements (i.e., regions or fragments of the 28S rRNA and analogs thereof) are used as targets for screening small molecules and identifying those that directly bind these specific sequences, motifs, and/or structural elements. For example, methods are described in which a preselected target RNA having a detectable label or method of detection is used to screen a library of compounds, preferably under physiologic conditions; and any complexes formed between the target RNA and a member of the library are identified using physical methods that detect the labeled or altered physical property of the target RNA bound to a compound. Further, methods are described in which a preselected target RNA is used to screen a library of compounds, with each compound in the library having a detectable label or method of detection, preferably under physiologic conditions; and any complexes formed between the target RNA and a member of the library are identified using physical methods that detect the labeled or altered physical property of the compound bound to target RNA.

The present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of the 28S rRNA, or RNA containing a premature stop codon), said methods comprising contacting a target RNA having a detectable label with a library of compounds free in solution, in, e.g., labeled tubes or a microtiter plate, and detecting the formation of a target RNA:compound complex. Compounds in the library that bind to the labeled target RNA will form a detectably labeled complex. The detectably labeled complex can then be identified and removed from the uncomplexed, unlabeled compounds, and from uncomplexed, labeled target RNA, by a variety of methods, including, but not limited to, methods that differentiate changes in the electrophoretic, chromatographic, or thermostable properties of the complexed target RNA. Such methods include, but are not limited to, electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation proximity assay, structure-activity relationships (“SARS”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation.

The present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions of 28S rRNA or RNA containing a premature stop codon), said methods comprising contacting a target RNA having a detectable label with a library of bound compounds, wherein each compound in the library is attached to a solid support, and detecting the formation of a target RNA:compound complex. In particular, the present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA, or RNA containing a premature stop codon), said methods comprising contacting a target RNA having a detectable label with a library of compounds wherein each compound is attached to a solid support (e.g., a bead-based library of compounds or a microarray of compounds), and detecting the formation of a target RNA:compound complex. Compounds in the library that bind to the labeled target RNA will form a detectably labeled complex. Compounds in the library that bind to the labeled target RNA will form a solid support-detectably labeled complex (e.g., a bead-based detectably labeled complex), which can be separated from the unbound solid support (e.g., beads) and unbound target RNA in the liquid phase by a number of physical means, including, but not limited to, flow cytometry, affinity chromatography, manual batch mode separation, suspension of beads in electric fields, and microwave of the bead-based detectably labeled complex.

The present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA or RNA containing a premature stop codon), said methods comprising contacting a target RNA with a library of compounds, wherein each compound in the library is detectably labeled, and detecting the formation of a target RNA:compound complex. In particular, the present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA, or RNA containing a premature stop codon), said methods comprising contacting a target RNA with a library of compounds free in solution, in, e.g., labeled tubes or a microtiter plate, wherein each compound in the library is detectably labeled, and detecting the formation of a target RNA:compound complex. Labeled compounds in the library that bind to the target RNA will form a detectably labeled complex. The detectably labeled complex can then be identified and removed from the uncomplexed, labeled compounds, and from uncomplexed target RNA, by a variety of methods, including, but not limited to, methods that differentiate changes in the electrophoretic, chromatographic, or thermostable properties of the complexed target RNA. Such methods include, but are not limited to, electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation proximity assay, structure-activity relationships (“SARS”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation.

The present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA or RNA containing a premature stop codon), said methods comprising contacting a target RNA attached or conjugated to a solid support with a library of compounds, wherein each compound in the library is detectably labeled, and detecting the formation of a target RNA:compound complex. Target RNA molecules that bind to labeled compounds will form a detectably labeled complex. Target RNA molecules that bind to labeled compounds will form a solid support-detectably labeled complex, which can be separated from unbound solid support-target RNA and unbound labeled compounds in the liquid phase by a number of means, including, but not limited to, flow cytometry, affinity chromatography, manual batch mode separation, suspension of beads in electric fields, and microwave of the bead-based detectably labeled complex.

In a specific embodiment, the invention provides a method for identifying a compound that binds to a target RNA, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region or fragment of 28S rRNA, or contains a premature stop codon; and (b) detecting the formation of a labeled target RNA:compound complex. In another embodiment, the invention provides a method for identifying a compound that binds to a target RNA, said method comprising detecting the formation of a detectably labeled target RNA:compound complex formed from contacting a detectably labeled RNA with a member of a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a labeled target RNA:compound complex, wherein the target RNA is a region or fragment of 28S rRNA, or contains a premature stop codon. In accordance with these embodiments, each compound in the library may be attached to a solid support. Non-limiting examples of solid supports include a silica gel, a resin, a derivatized plastic film, a glass bead, cotton, a plastic bead, a polystyrene bead, an aluminum gel, a glass slide or a polysaccharide.

In another specific embodiment, the invention provides a method for identifying a compound that binds to a target RNA, said method comprising: (a) contacting a target RNA molecule with a library of detectably labeled compounds under conditions that permit direct binding of the target RNA to a member of the library of labeled compounds and the formation of a detectable target RNA:compound complex, wherein the target RNA is a region or fragment of 28S rRNA, or contains a premature stop codon; and (b) detecting the formation of a target RNA:compound complex. In another embodiment, the invention provides a method for identifying a compound that binds to a target RNA, said method comprising detecting the formation of a target RNA:compound complex formed from contacting an RNA with a member of a library of detectably labeled compounds under conditions that permit direct binding of the target RNA to a member of the library of labeled compounds and the formation of a target RNA:compound complex, wherein the target RNA is a region or fragment of 28S rRNA, or contains a premature stop codon. In accordance with these embodiments, the target RNA may be attached to a solid support. Non-limiting examples of solid supports include a silica gel, a resin, a derivatized plastic film, a glass bead, cotton, a plastic bead, a polystyrene bead, an aluminum gel, a glass slide or a polysaccharide.

In another specific embodiment, the invention provides a method for identifying a compound that binds to a target RNA, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of detectably labeled compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of labeled compounds and the formation of a detectable target RNA:compound complex, wherein the target RNA is a region or fragment of 28S rRNA, or contains a premature stop codon; and (b) detecting the formation of a target RNA:compound complex. In another embodiment, the invention provides a method for identifying a compound that binds to a target RNA, said method comprising detecting the formation of a target RNA:compound complex formed from contacting a labeled RNA with a member of a library of detectably labeled compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of labeled compounds and the formation of a target RNA:compound complex, wherein the target RNA is a region or fragment of 28S rRNA or contains a premature stop codon. In accordance with these embodiments, the target RNA may be attached to a solid support. Non-limiting examples of solid supports are provided infra. A number of techniques can be used to detect the interaction between target RNA and the compounds of the invention. In a specific embodiment, fluorescence resonance energy transfer (FRET) is used to detect the interaction between the target RNA and the compound of the invention. Examples of FRET assays are known in the art and are also provided herein (see, e.g., Section 5.6.2).

The methods described herein for the identification of compounds that directly bind to 28S rRNA or an RNA containing a premature stop codon are well suited for high-throughput screening. The direct binding method of the invention offers advantages over drug screening systems for competitors that inhibit the formation of naturally-occurring RNA binding protein:target RNA complexes; i.e., competitive assays. The direct binding method of the invention is rapid and can be set up to be readily performed, e.g., by a technician, making it amenable to high-throughput screening. The methods of the invention also eliminate the bias inherent in the competitive drug screening systems, which require the use of a preselected host cell factor that may not have physiological relevance to the activity of the target RNA. Instead, the methods of the invention are used to identify any compound that can directly bind to a target RNA (e.g., 28S rRNA or an RNA containing a premature stop codon), preferably under physiologic conditions. As a result, the compounds so identified can inhibit the interaction of the target RNA with any one or more of the native host cell factors (whether known or unknown) required for activity of the RNA in vivo.

The compounds utilized in the assays described herein may be members of a library of compounds. In a specific embodiment, the compound is selected from a combinatorial library of compounds comprising peptides; random biooligomers; diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous polypeptides; nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates; peptide nucleic acid libraries; antibody libraries; carbohydrate libraries; and small organic molecule libraries. In a preferred embodiment, the small organic molecule libraries are libraries of benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, morpholino compounds, or diazepindiones.

In certain embodiments, the compounds are screened in pools. Once a positive pool has been identified, the individual compounds of that pool are tested separately. In certain embodiments, the pool size is at least 2, at least 5, at least 10, at least 25, at least 50, at least 75, at least 100, at least 150, at least 200, at least 250, or at least 500 compounds.
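The pooled screening strategy can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular assay: the two callables `pool_is_positive` and `compound_is_positive` are hypothetical stand-ins for the binding readout, and fixed-size consecutive pooling is only one possible pooling scheme.

```python
# Illustrative sketch only; the assay callables are hypothetical stand-ins.
def make_pools(compounds, pool_size):
    """Split a list of compounds into consecutive pools of at most pool_size."""
    return [compounds[i:i + pool_size]
            for i in range(0, len(compounds), pool_size)]


def deconvolute(compounds, pool_size, pool_is_positive, compound_is_positive):
    """Screen pools first, then test members of positive pools individually."""
    hits = []
    for pool in make_pools(compounds, pool_size):
        if pool_is_positive(pool):  # first pass: one assay per pool
            # second pass: one assay per compound in a positive pool
            hits.extend(c for c in pool if compound_is_positive(c))
    return hits


# Toy example with a simulated readout.
library = ["c%d" % i for i in range(10)]
binders = {"c3", "c7"}
hits = deconvolute(
    library,
    pool_size=5,
    pool_is_positive=lambda pool: any(c in binders for c in pool),
    compound_is_positive=lambda c: c in binders,
)
print(hits)
```

With few binders in a large library, most pools are negative and are set aside after a single assay, which is the practical motivation for pooling.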

Once a compound is identified in accordance with the invention, the structure of the compound may be determined utilizing well-known techniques or by referring to a predetermined code. The methods used will depend, in part, on the nature of the library screened. For example, assays of microarrays of compounds, each having an address or identifier, may be deconvoluted, e.g., by cross-referencing the positive sample to the original compound list that was applied to the individual test assays. Another method for identifying compounds includes de novo structure determination of the compounds using, for example, mass spectrometry or nuclear magnetic resonance (“NMR”). The compounds identified are useful for any purpose to which a binding reaction may be put, for example in assay methods, diagnostic procedures, cell sorting, as inhibitors of target molecule function, as probes, as sequestering agents and the like. In addition, small organic molecules which interact specifically with target RNA molecules may be useful as lead compounds for the development of therapeutic agents.
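Deconvolution by address amounts to a lookup of each positive position against the compound list that was originally applied. The following Python sketch illustrates this under the assumption that the list is available as a mapping from address to compound identifier; the well addresses and compound identifiers shown are invented for the example.

```python
# Illustrative sketch only; addresses and compound identifiers are invented.
def identify_hits(compound_by_address, positive_addresses):
    """Map each positive address to its compound; collect unknown addresses."""
    identified = {}
    unknown = []
    for address in positive_addresses:
        if address in compound_by_address:
            identified[address] = compound_by_address[address]
        else:
            unknown.append(address)  # no record of what was applied here
    return identified, unknown


compound_by_address = {"A1": "cmpd-0001", "A2": "cmpd-0002", "B1": "cmpd-0013"}
identified, unknown = identify_hits(compound_by_address, ["A2", "B1", "H12"])
print(identified, unknown)
```

Addresses with no matching record are reported separately rather than dropped, so that a positive signal without a known compound can be followed up, for example by de novo structure determination.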

A compound identified in accordance with the methods of the invention may bind to a premature stop codon. A compound identified in accordance with the methods of the invention may also disrupt an interaction between a premature stop codon and the mRNA translation machinery. In a preferred embodiment, a compound identified in accordance with the methods of the invention binds to RNA and suppresses premature translation termination and/or nonsense-mediated mRNA decay of a gene encoding a protein, polypeptide or peptide whose expression is beneficial to a subject. In another preferred embodiment, a compound identified in accordance with the methods of the invention binds to RNA and increases premature translation termination and/or nonsense-mediated mRNA decay of a gene encoding a protein, polypeptide or peptide whose expression is detrimental to a subject. In a specific embodiment, a compound identified in accordance with the methods of the invention preferentially or differentially modulates premature translation termination and/or nonsense-mediated mRNA decay of a specific nucleotide sequence of interest relative to another nucleotide sequence.

In certain embodiments of the invention, the compound identified using the assays described herein is a small molecule. In a preferred embodiment, the compound identified using the assays described herein is not known to affect premature translation termination and/or nonsense-mediated mRNA decay of a nucleic acid sequence, in particular a nucleic acid sequence of interest. In another preferred embodiment, the compound identified using the assays described herein has not been used as or suggested to be used in the prevention, treatment, management and/or amelioration of a disorder associated with, characterized by or caused by a premature stop codon. In another preferred embodiment, the compound identified using the assays described herein has not been used as or suggested to be used in the prevention, treatment, management and/or amelioration of a particular disorder described herein.

A compound identified in accordance with the methods of the invention may be tested in in vitro and/or in vivo assays well-known to one of skill in the art or described herein to determine the prophylactic or therapeutic effect of a particular compound for a particular disorder. In particular, a compound identified utilizing the assays described herein may be tested in an animal model to determine the efficacy of the compound in the prevention, treatment or amelioration of a disorder associated with, characterized by or caused by a premature stop codon, or a disorder described herein, or a symptom thereof. In addition, a compound identified utilizing the assays described herein may be tested for its toxicity in in vitro and/or in vivo assays well-known to one of skill in the art. Further, a compound identified as binding to a target RNA utilizing assays described herein or those well-known in the art may be tested for its ability to modulate premature translation termination and/or nonsense-mediated mRNA decay.

In a specific embodiment, the invention provides a method for identifying a compound to test for its ability to modulate premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region or fragment of 28S rRNA or contains a premature stop codon; and (b) detecting a detectably labeled target RNA:compound complex formed in step (a), so that if a target RNA:compound complex is detected then the compound identified is tested for its ability to modulate premature translation termination or nonsense-mediated mRNA decay.

In a specific embodiment, the invention provides a method for identifying a compound to test for its ability to modulate premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a target RNA molecule with a library of detectably labeled compounds under conditions that permit direct binding of the target RNA to a member of the library of labeled compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region or fragment of 28S rRNA or contains a premature stop codon; and (b) detecting a detectably labeled target RNA:compound complex formed in step (a), so that if a target RNA:compound complex is detected then the compound identified is tested for its ability to modulate premature translation termination or nonsense-mediated mRNA decay.

The invention provides cell-based and cell-free assays to test the ability of a compound identified in accordance with the methods of the invention to modulate premature translation termination and/or nonsense-mediated mRNA decay. In particular, the invention provides cell-based and cell-free reporter assays for the identification of a compound that modulates premature translation termination and/or nonsense-mediated mRNA decay. In general, the level of expression and/or activity of a reporter gene product in the reporter gene-based assays described herein is indicative of the effect of the compound on premature translation termination and/or nonsense-mediated mRNA decay. The reporter gene-based assays described herein for the identification of compounds that modulate premature translation termination and/or nonsense-mediated mRNA decay are well suited for high-throughput screening.

The reporter gene cell-based assays may be conducted by contacting a compound with a cell containing a nucleic acid sequence comprising a reporter gene, wherein the reporter gene contains a premature stop codon or nonsense mutation, and measuring the expression of the reporter gene. The reporter gene cell-free assays may be conducted by contacting a compound with a cell-free extract and a nucleic acid sequence comprising a reporter gene, wherein the reporter gene contains a premature stop codon or nonsense mutation, and measuring the expression of the reporter gene. In the cell-based and cell-free reporter gene assays described herein, the alteration in reporter gene expression or activity relative to a previously determined reference range, or to the expression or activity of the reporter gene in the absence of the compound or the presence of an appropriate control (e.g., a negative control) indicates that a particular compound modulates premature translation termination and/or nonsense-mediated mRNA decay. In particular, an increase in reporter gene expression or activity relative to a previously determined reference range, or to the expression in the absence of the compound or the presence of an appropriate control (e.g., a negative control) may, depending upon the parameters of the reporter gene assay, indicate that a particular compound reduces or suppresses premature translation termination and/or nonsense-mediated mRNA decay (i.e., increases nonsense suppression). In contrast, a decrease in reporter gene expression or activity relative to a previously determined reference range, or to the expression in the absence of the compound or the presence of an appropriate control (e.g., a negative control) may, depending upon the parameters of the reporter gene-based assay, indicate that a particular compound enhances premature translation termination and/or nonsense-mediated mRNA decay (i.e., decreases nonsense suppression).
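The comparison of reporter signal in the presence of a compound against a control can be sketched as a simple fold-change classification. The Python example below is illustrative only: the function name, the 1.5-fold threshold, and the signal values are assumptions chosen for the example and are not specified in this application; in practice the cut-off would depend upon the parameters of the reporter gene assay.

```python
# Illustrative sketch only; the threshold and signal values are assumptions.
def classify_reporter_response(signal_with_compound, signal_control,
                               fold_threshold=1.5):
    """Classify a reporter readout relative to a control by fold change."""
    if signal_control <= 0:
        raise ValueError("control signal must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = signal_with_compound / signal_control
    if ratio >= fold_threshold:
        # More reporter product: consistent with read-through of the stop codon
        return "increased nonsense suppression"
    if ratio <= 1 / fold_threshold:
        # Less reporter product: consistent with enhanced termination or decay
        return "decreased nonsense suppression"
    return "no change detected"


print(classify_reporter_response(300.0, 100.0))
print(classify_reporter_response(50.0, 100.0))
print(classify_reporter_response(110.0, 100.0))
```

A symmetric threshold is used so that increases and decreases are judged on the same fold-change scale; a previously determined reference range could be substituted for the single control value.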

In a specific embodiment, the invention provides a method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; (b) detecting a labeled target RNA:compound complex formed in step (a); so that if a target RNA:compound complex is detected, then (c) contacting the compound with a cell-free translation mixture and a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control. In accordance with this embodiment, each compound in the library may be attached to a solid support.

In another embodiment, the invention provides a method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; (b) detecting a labeled target RNA:compound complex formed in step (a); so that if a target RNA:compound complex is detected, then (c) contacting the compound with a cell containing a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control. In accordance with this embodiment, each compound in the library may be attached to a solid support.

In a specific embodiment, the invention provides a method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; (b) detecting a labeled target RNA:compound complex formed in step (a); then (c) contacting the compound with a cell-free translation mixture and a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control. In accordance with this embodiment, each compound in the library may be attached to a solid support.

In another embodiment, the invention provides a method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; (b) detecting a labeled target RNA:compound complex formed in step (a); then (c) contacting the compound with a cell containing a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control. In accordance with this embodiment, each compound in the library may be attached to a solid support.

In a specific embodiment, the invention provides a method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a target RNA molecule with a library of detectably labeled compounds under conditions that permit direct binding of the target RNA to a member of the library of labeled compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; (b) detecting a labeled target RNA:compound complex formed in step (a); then (c) contacting the compound with a cell-free translation mixture and a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control. In accordance with this embodiment, the target RNA may be attached or conjugated to a solid support, or detectably labeled.

In another embodiment, the invention provides a method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a target RNA molecule with a library of detectably labeled compounds under conditions that permit direct binding of the target RNA to a member of the library of labeled compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; (b) detecting a labeled target RNA:compound complex formed in step (a); then (c) contacting the compound with a cell containing a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control. In accordance with this embodiment, the target RNA may be attached or conjugated to a solid support, or detectably labeled.

In a specific embodiment, the invention provides a method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a target RNA molecule with a library of detectably labeled compounds under conditions that permit direct binding of the target RNA to a member of the library of labeled compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; (b) detecting a labeled target RNA:compound complex formed in step (a); so that if a target RNA:compound complex is detected, then (c) contacting the compound with a cell-free translation mixture and a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control. In accordance with this embodiment, the target RNA may be attached or conjugated to a solid support, or detectably labeled.

In another embodiment, the invention provides a method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a target RNA molecule with a library of detectably labeled compounds under conditions that permit direct binding of the target RNA to a member of the library of labeled compounds and the formation of a detectably labeled target RNA:compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; (b) detecting a labeled target RNA:compound complex formed in step (a); so that if a target RNA:compound complex is detected, then (c) contacting the compound with a cell containing a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control. In accordance with this embodiment, the target RNA may be attached or conjugated to a solid support, or detectably labeled.
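The decision logic shared by steps (b) through (d) of the foregoing embodiments can be sketched, by way of non-limiting illustration, as follows. The function names, the use of replicate means, and the two-fold threshold are assumptions made for the example only; the specification does not prescribe any particular threshold or statistic for deciding that reporter expression is "altered".

```python
from statistics import mean


def is_modulator(with_compound, without_compound, min_fold_change=2.0):
    """Return True if reporter expression is altered in either direction.

    with_compound / without_compound: replicate reporter readouts
    (hypothetical units). min_fold_change is an assumed cut-off.
    """
    treated = mean(with_compound)
    control = mean(without_compound)
    if treated <= 0 or control <= 0:
        raise ValueError("reporter readouts must be positive")
    fold = treated / control
    # "Altered" covers both increased and decreased expression.
    return fold >= min_fold_change or fold <= 1.0 / min_fold_change


def screen(binding_detected, with_compound, without_compound,
           min_fold_change=2.0):
    """Steps (b)-(d): test reporter expression only if a complex was detected."""
    if not binding_detected:
        return False
    return is_modulator(with_compound, without_compound, min_fold_change)
```

For instance, under these assumptions a compound that binds the target RNA and raises the mean reporter readout from about 105 to about 310 would be flagged, whereas a non-binding compound would not be carried forward to the reporter assay at all.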

The invention provides methods for preventing, treating, managing or ameliorating a disorder associated with, characterized by or caused by a premature translation termination and/or nonsense-mediated mRNA decay or a symptom thereof, said method comprising administering to a subject in need thereof a therapeutically or prophylactically effective amount of a compound, or a pharmaceutically acceptable salt thereof, identified according to the methods described herein.

The present invention may be understood more fully by reference to the detailed description and examples, which are intended to illustrate non-limiting embodiments of the invention.

3.1 Terminology

As used herein, the term “compound” refers to any agent or complex that is being tested for its ability to interact with a target nucleic acid (in particular, a target RNA) or has been identified as interacting with a target nucleic acid (in particular, a target RNA).

As used herein, the terms “disorder” and “disease” refer to a condition in a subject.

As used herein, a “dye” refers to a molecule that, when exposed to radiation, emits radiation at a level that is detectable visually or via conventional spectroscopic means. As used herein, a “visible dye” refers to a molecule having a chromophore that absorbs radiation in the visible region of the spectrum (i.e., having a wavelength of between about 400 nm and about 700 nm) such that the transmitted radiation is in the visible region and can be detected either visually or by conventional spectroscopic means. As used herein, an “ultraviolet dye” refers to a molecule having a chromophore that absorbs radiation in the ultraviolet region of the spectrum (i.e., having a wavelength of between about 30 nm and about 400 nm). As used herein, an “infrared dye” refers to a molecule having a chromophore that absorbs radiation in the infrared region of the spectrum (i.e., having a wavelength between about 700 nm and about 3,000 nm). A “chromophore” is the network of atoms of the dye that, when exposed to radiation, emits radiation at a level that is detectable visually or via conventional spectroscopic means. One of skill in the art will readily appreciate that although a dye absorbs radiation in one region of the spectrum, it may emit radiation in another region of the spectrum. For example, an ultraviolet dye may emit radiation in the visible region of the spectrum. One of skill in the art will also readily appreciate that a dye can transmit radiation or can emit radiation via fluorescence or phosphorescence.
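By way of illustration only, the wavelength ranges recited in the above definitions can be expressed as a simple classification by absorption wavelength. The function name is hypothetical, and because the definitions use "about", the assignment of the exact boundary values (400 nm and 700 nm) to the visible class is an assumption of this sketch.

```python
def classify_dye(absorption_nm):
    """Classify a dye by the wavelength (nm) its chromophore absorbs.

    Ranges follow the definitions above; boundary handling is an
    assumption because the definitions say "about".
    """
    if 30 <= absorption_nm < 400:
        return "ultraviolet"
    if 400 <= absorption_nm <= 700:
        return "visible"
    if 700 < absorption_nm <= 3000:
        return "infrared"
    # Outside every range recited in the definitions.
    return None
```

Note that the classification is by absorption only; as stated above, a dye classified as ultraviolet may still emit in the visible region.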

As used herein, the term “effective amount” refers to the amount of a compound which is sufficient to (i) reduce or ameliorate the progression, severity and/or duration of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay), or one or more symptoms thereof, (ii) prevent the development, recurrence or onset of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay), or one or more symptoms thereof, (iii) prevent the advancement of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay), or one or more symptoms thereof, or (iv) enhance or improve the therapeutic effect(s) of another therapy.

As used herein, the term “fragment”, in the context of a protein or polypeptide, refers to a peptide sequence of at least 5 contiguous residues, at least 10 contiguous residues, at least 15 contiguous residues, at least 20 contiguous residues, at least 25 contiguous residues, at least 40 contiguous residues, at least 50 contiguous residues, at least 60 contiguous residues, at least 70 contiguous residues, at least 80 contiguous residues, at least 90 contiguous residues, at least 100 contiguous residues, at least 125 contiguous residues, at least 150 contiguous residues, at least 175 contiguous residues, at least 200 contiguous residues, or at least 250 contiguous residues of the sequence of another protein or polypeptide. In a specific embodiment, a fragment of a protein or polypeptide retains at least one function of the protein or polypeptide.

As used herein, the term “fragment”, in the context of a nucleic acid sequence, refers to a nucleotide sequence of at least 5 contiguous bases, at least 10 contiguous bases, at least 15 contiguous bases, at least 20 contiguous bases, at least 25 contiguous bases, at least 40 contiguous bases, at least 50 contiguous bases, at least 60 contiguous bases, at least 70 contiguous bases, at least 80 contiguous bases, at least 90 contiguous bases, at least 100 contiguous bases, at least 125 contiguous bases, at least 150 contiguous bases, at least 175 contiguous bases, at least 200 contiguous bases, or at least 250 contiguous bases of the sequence of another nucleic acid sequence. In a specific embodiment, a fragment of a nucleic acid sequence retains at least one domain of the nucleic acid sequence.

As used herein, the term “in combination” refers to the use of more than one therapy (e.g., prophylactic and/or therapeutic agents). The use of the term “in combination” does not restrict the order in which therapies (e.g., prophylactic and/or therapeutic agents) are administered to a subject with a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). A first therapy (e.g., a prophylactic or therapeutic agent such as a compound identified in accordance with the methods of the invention) can be administered prior to (e.g., 5 minutes, 15 minutes, 30 minutes, 45 minutes, 1 hour, 2 hours, 4 hours, 6 hours, 12 hours, 24 hours, 48 hours, 72 hours, 96 hours, 1 week, 2 weeks, 3 weeks, 4 weeks, 5 weeks, 6 weeks, 8 weeks, or 12 weeks before), concomitantly with, or subsequent to (e.g., 5 minutes, 15 minutes, 30 minutes, 45 minutes, 1 hour, 2 hours, 4 hours, 6 hours, 12 hours, 24 hours, 48 hours, 72 hours, 96 hours, 1 week, 2 weeks, 3 weeks, 4 weeks, 5 weeks, 6 weeks, 8 weeks, or 12 weeks after) the administration of a second therapy (e.g., a prophylactic or therapeutic agent such as a chemotherapeutic agent or a TNF-α antagonist) to a subject with a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay).

As used herein, a “label” or “detectable label” is a composition that is detectable, either directly or indirectly, by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive isotopes (e.g., ³²P, ³⁵S, and ³H), dyes, fluorescent dyes, electron-dense reagents, enzymes and their substrates (e.g., as commonly used in enzyme-linked immunoassays, e.g., alkaline phosphatase and horseradish peroxidase), biotin, streptavidin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available. Moreover, a label or detectable moiety can include an “affinity tag” that, when coupled with the target nucleic acid and incubated with a compound or compound library, allows for the affinity capture of the target nucleic acid along with molecules bound to the target nucleic acid. One skilled in the art will appreciate that an affinity tag bound to the target nucleic acid has, by definition, a complementary ligand coupled to a solid support that allows for its capture. For example, useful affinity tags and complementary ligands or partners include, but are not limited to, biotin-streptavidin, complementary nucleic acid fragments (e.g., oligo dT-oligo dA, oligo T-oligo A, oligo dG-oligo dC, oligo G-oligo C), aptamer complexes, aptamers, or haptens and proteins for which antisera or monoclonal antibodies are available. The label or detectable moiety is typically bound, either covalently, through a linker or chemical bond, or through ionic, van der Waals or hydrogen bonds, to the molecule to be detected.

As used herein, a “library” in the context of compounds refers to a plurality of compounds with which a target nucleic acid molecule is contacted. A library can be a combinatorial library, e.g., a collection of compounds synthesized using combinatorial chemistry techniques, or a collection of unique chemicals of low molecular weight (less than 1000 daltons) that each occupy a unique three-dimensional space.

As used herein, the terms “manage”, “managing” and “management” refer to the beneficial effects that a subject derives from a therapy (e.g., a prophylactic or therapeutic agent) which does not result in a cure of the disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). In certain embodiments, a subject is administered one or more therapies to “manage” a disease or disorder so as to prevent the progression or worsening of the disease or disorder.

As used herein, the phrase “modulation of premature translation termination and/or nonsense-mediated mRNA decay” refers to the regulation of gene expression by altering the level of nonsense suppression. For example, if it is desirable to increase production of a defective protein encoded by a gene with a premature stop codon, i.e., to permit read through of the premature stop codon of the disease gene so translation of the gene can occur, then modulation of premature translation termination and/or nonsense-mediated mRNA decay entails up-regulation of nonsense suppression. Conversely, if it is desirable to promote the degradation of an mRNA with a premature stop codon, then modulation of premature translation termination and/or nonsense-mediated mRNA decay entails down-regulation of nonsense suppression.

As used herein, the terms “non-responsive” and “refractory” describe patients treated with a currently available therapy (e.g., prophylactic or therapeutic agent) for a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay such as, e.g., cancer), which is not clinically adequate to relieve one or more symptoms associated with such disorder. Typically, such patients suffer from severe, persistently active disease and require additional therapy to ameliorate the symptoms associated with their disorder.

As used herein, “nonsense-mediated mRNA decay” refers to any mechanism that mediates the decay of mRNAs containing a premature translation termination codon.

As used herein, a “nonsense mutation” is a point mutation changing a codon corresponding to an amino acid to a stop codon.
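The definition of a nonsense mutation can be illustrated with a short check on a single codon. This is a minimal sketch: the function name is hypothetical, and the three stop codons (UAA, UAG, UGA) are those of the standard genetic code, which is assumed here.

```python
# Stop codons of the standard genetic code (assumed for this sketch).
STOP_CODONS = {"UAA", "UAG", "UGA"}


def is_nonsense_mutation(original_codon, mutated_codon):
    """Return True if a single-base change turns a sense codon into a stop codon."""
    # Accept DNA-style input by converting T to U.
    original = original_codon.upper().replace("T", "U")
    mutated = mutated_codon.upper().replace("T", "U")
    if len(original) != 3 or len(mutated) != 3:
        raise ValueError("codons must be exactly three bases long")
    # A point mutation differs from the original at exactly one position.
    differences = sum(a != b for a, b in zip(original, mutated))
    return (differences == 1
            and original not in STOP_CODONS
            and mutated in STOP_CODONS)
```

For example, a change from CAG (glutamine) to UAG satisfies the definition, whereas a change from one stop codon to another, or from one sense codon to another, does not.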

As used herein, “nonsense suppression” refers to the inhibition or suppression of premature translation termination and/or nonsense-mediated mRNA decay.

As used herein, the phrase “pharmaceutically acceptable salt(s)” includes but is not limited to salts of acidic or basic groups that may be present in compounds identified using the methods of the present invention. Compounds that are basic in nature are capable of forming a wide variety of salts with various inorganic and organic acids. The acids that can be used to prepare pharmaceutically acceptable acid addition salts of such basic compounds are those that form non-toxic acid addition salts, i.e., salts containing pharmacologically acceptable anions, including but not limited to sulfuric, citric, maleic, acetic, oxalic, hydrochloride, hydrobromide, hydroiodide, nitrate, sulfate, bisulfate, phosphate, acid phosphate, isonicotinate, acetate, lactate, salicylate, citrate, acid citrate, tartrate, oleate, tannate, pantothenate, bitartrate, ascorbate, succinate, maleate, gentisinate, fumarate, gluconate, glucuronate, saccharate, formate, benzoate, glutamate, methanesulfonate, ethanesulfonate, benzenesulfonate, p-toluenesulfonate and pamoate (i.e., 1,1′-methylene-bis-(2-hydroxy-3-naphthoate)) salts. Compounds that include an amino moiety may form pharmaceutically or cosmetically acceptable salts with various amino acids, in addition to the acids mentioned above. Compounds that are acidic in nature are capable of forming base salts with various pharmacologically or cosmetically acceptable cations. Examples of such salts include alkali metal or alkaline earth metal salts and, particularly, calcium, magnesium, sodium, lithium, zinc, potassium, and iron salts.

As used herein, the term “previously determined reference range” refers to a reference range for the readout of a particular assay. Each laboratory will establish its own reference range for each particular assay. In a preferred embodiment, at least one positive control and at least one negative control are included in each batch of compounds analyzed.
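One non-limiting way a laboratory might establish and apply such a reference range is sketched below. The choice of mean plus or minus three standard deviations of repeated control readouts is an assumption of this example, not a requirement of the definition, and all function names and well labels are hypothetical.

```python
from statistics import mean, stdev


def reference_range(control_readouts, k=3.0):
    """Reference range as mean +/- k standard deviations (assumed convention)."""
    if len(control_readouts) < 2:
        raise ValueError("need at least two control readouts")
    centre = mean(control_readouts)
    spread = stdev(control_readouts)
    return (centre - k * spread, centre + k * spread)


def batch_has_controls(wells):
    """Check that a batch holds at least one positive and one negative control.

    wells: iterable of (role, readout) pairs, where role is a label such as
    "positive", "negative" or "compound" (hypothetical labels).
    """
    roles = {role for role, _ in wells}
    return "positive" in roles and "negative" in roles


def outside_range(readout, ref_range):
    """Return True if a readout falls outside the reference range."""
    low, high = ref_range
    return readout < low or readout > high
```

A readout falling outside the range computed from a laboratory's own control data would then be a candidate for follow-up, provided the batch it came from passes the control check.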

As used herein, a “premature termination codon” or “premature stop codon” refers to the occurrence of a stop codon instead of a codon corresponding to an amino acid.

As used herein, “premature translation termination” refers to the result of a mutation that changes a codon corresponding to an amino acid to a stop codon.
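As an illustration of the two preceding definitions, the following sketch scans an open reading frame for an in-frame stop codon located before the final codon. It assumes the standard stop codons, a sequence that begins in frame at the start codon, and a normal stop codon in the last position; the function name is hypothetical.

```python
# Stop codons of the standard genetic code (assumed for this sketch).
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_premature_stop(orf):
    """Return the 0-based codon index of the first premature stop, or None.

    orf: reading frame beginning at the start codon; T is treated as U.
    The last codon is taken to be the normal stop and is not reported.
    """
    seq = orf.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            return index
    return None
```

Under these assumptions, a reporter gene of the kind recited in the screening methods would return a codon index, while the corresponding wild-type reading frame would return None.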

As used herein, the terms “prevent”, “preventing” and “prevention” refer to the prevention of the development, recurrence or onset of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) or one or more symptoms thereof resulting from the administration of one or more compounds identified in accordance with the methods of the invention or the administration of a combination of such a compound and a known therapy for such a disorder.

As used herein, the terms “prophylactic agent” and “prophylactic agents” refer to any agent(s) which can be used in the prevention of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). In certain embodiments, the term “prophylactic agent” refers to a compound identified in the screening assays described herein. In certain other embodiments, the term “prophylactic agent” refers to an agent other than a compound identified in the screening assays described herein which is known to be useful for, or has been or is currently being used to prevent or impede the onset, development and/or progression of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) or one or more symptoms thereof.

As used herein, the phrase “prophylactically effective amount” refers to the amount of a therapy (e.g., a prophylactic agent) which is sufficient to result in the prevention of the development, recurrence or onset of one or more symptoms associated with a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay).

As used herein, the term “purified,” in the context of a compound, e.g., a compound identified in accordance with the method of the invention, refers to a compound that is substantially free of chemical precursors or other chemicals when chemically synthesized. In a specific embodiment, the compound is 60%, preferably 65%, 70%, 75%, 80%, 85%, 90%, or 99% free of other, different compounds. In a preferred embodiment, a compound identified in accordance with the methods of the invention is purified.

As used herein, the term “reporter gene” refers to a nucleotide sequence encoding a protein, polypeptide or peptide that is readily detectable either by its presence or activity. Any reporter gene well-known to one of skill in the art may be used in reporter gene constructs to ascertain the effect of a compound on premature translation termination.

As used herein, the term “small molecule” and analogous terms include, but are not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and/or organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

As used herein, the terms “subject” and “patient” are used interchangeably. The terms “subject” and “subjects” refer to an animal, preferably a mammal including a non-primate (e.g., a cow, pig, horse, cat, dog, rat, and mouse) and a primate (e.g., a chimpanzee, a monkey such as a cynomolgus monkey and a human), and more preferably a human. In one embodiment, the subject is refractory or non-responsive to current therapies for a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). In another embodiment, the subject is a farm animal (e.g., a horse, a cow, a pig, etc.) or a pet (e.g., a dog or a cat). In a preferred embodiment, the subject is a human.

As used herein, the term “synergistic” refers to a combination of a compound identified using one of the methods described herein, and another therapy (e.g., a prophylactic or therapeutic agent), which combination is more effective than the additive effects of the therapies. A synergistic effect of a combination of therapies (e.g., prophylactic or therapeutic agents) permits the use of lower dosages of one or more of the therapies and/or less frequent administration of said therapies to a subject with a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). The ability to utilize lower dosages of a therapy (e.g., a prophylactic or therapeutic agent) and/or to administer said therapy less frequently reduces the toxicity associated with the administration of said therapy to a subject without reducing the efficacy of said therapies in the prevention, treatment, management or amelioration of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). In addition, a synergistic effect can result in improved efficacy of therapies (e.g., prophylactic or therapeutic agents) in the prevention, treatment, management or amelioration of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). Finally, a synergistic effect of a combination of therapies (e.g., prophylactic or therapeutic agents) may avoid or reduce adverse or unwanted side effects associated with the use of either therapy alone.

As used herein, the term “substantially one type of compound” means that the assay can be performed in such a fashion that at some point, only one compound need be used in each reaction so that, if the result is indicative of a binding event occurring between the target RNA molecule and the compound, the compound can be easily identified.

As used herein, a “target nucleic acid” refers to RNA, DNA, or a chemically modified variant thereof. In a preferred embodiment, the target nucleic acid is RNA. A target nucleic acid also refers to tertiary structures of the nucleic acids, such as, but not limited to, loops, bulges, pseudoknots, guanosine quartets and turns. A target nucleic acid also refers to RNA elements such as, but not limited to, 28S rRNA and structural analogs thereof, which are described in Sections 5.1 and 5.2. Non-limiting examples of target nucleic acids are presented in Sections 5.1 and 5.2.

As used herein, a “target RNA” refers to RNA or a chemically modified variant thereof. A target RNA also refers to tertiary structures of RNA, such as, but not limited to, loops, bulges, pseudoknots, guanosine quartets and turns. A target RNA also refers to RNA elements such as, but not limited to, 28S rRNA and structural analogs thereof, which are described in Sections 5.1 and 5.2. Non-limiting examples of target RNAs are presented in Sections 5.1 and 5.2. In a specific embodiment, a target RNA is at least 25 nucleotides, preferably at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, at least 55 nucleotides, at least 60 nucleotides, at least 65 nucleotides, at least 70 nucleotides, at least 75 nucleotides, at least 80 nucleotides, at least 85 nucleotides, at least 90 nucleotides, at least 95 nucleotides, at least 100 nucleotides, at least 125 nucleotides, at least 150 nucleotides, at least 175 nucleotides or at least 200 nucleotides in length.

As used herein, the terms “therapeutic agent” and “therapeutic agents” refer to any agent(s) which can be used in the prevention, treatment, management or amelioration of one or more symptoms of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). In certain embodiments, the term “therapeutic agent” refers to a compound identified in the screening assays described herein. In other embodiments, the term “therapeutic agent” refers to an agent other than a compound identified in the screening assays described herein which is known to be useful for, or has been or is currently being used to prevent, treat, manage or ameliorate a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) or one or more symptoms thereof.

As used herein, the term “therapeutically effective amount” refers to that amount of a therapy (e.g., a therapeutic agent) sufficient to (i) ameliorate one or more symptoms of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay), (ii) prevent advancement of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay), (iii) cause regression of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay), or (iv) enhance or improve the therapeutic effect(s) of another therapy (e.g., therapeutic agent).

As used herein, the terms “treat”, “treatment” and “treating” refer to the reduction or amelioration of the progression, severity and/or duration of a disorder (e.g., a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) or one or more symptoms thereof resulting from the administration of one or more compounds identified in accordance with the methods of the invention, or a combination of one or more compounds identified in accordance with the invention and another therapy.

As used herein, the terms “therapy” and “therapies” refer to any method, protocol and/or agent that can be used in the prevention, treatment, management or amelioration of a disease or disorder or one or more symptoms thereof. In certain embodiments, such terms refer to chemotherapy, radiation therapy, surgery, supportive therapy and/or other therapies useful in the prevention, treatment, management or amelioration of a disease or disorder or one or more symptoms thereof known to skilled medical personnel.

4. DESCRIPTION OF DRAWINGS

FIG. 1. The human 28S rRNA. Domains II and V are circled.

FIG. 2. Gel retardation analysis to detect peptide-RNA interactions. In 20 μl reactions containing 50 pmole end-labeled TAR RNA oligonucleotide, increasing concentrations of Tat₄₇₋₅₈ peptide (0.1 μM, 0.2 μM, 0.4 μM, 0.8 μM, 1.6 μM) were added in TK buffer. The reaction mixture was then heated at 90° C. for 2 min and allowed to cool slowly to 24° C. 10 μl of 30% glycerol was added to each sample and applied to a 12% non-denaturing polyacrylamide gel. The gel was electrophoresed using 1200 volt-hours at 4° C. in TBE Buffer. Following electrophoresis, the gel was dried and the radioactivity was quantitated with a phosphorimager. The concentration of peptide added is indicated above each lane.

FIG. 3. Gentamicin interacts with an oligonucleotide corresponding to the 16S rRNA. 20 μl reactions containing increasing concentrations of gentamicin (1 ng/ml, 10 ng/ml, 100 ng/ml, 1 μg/ml, 10 μg/ml, 50 μg/ml, 500 μg/ml) were added to 50 pmole end-labeled RNA oligonucleotide in TKM buffer, heated at 90° C. for 2 min and allowed to cool slowly to 24° C. 10 μl of 30% glycerol was added to each sample and the samples were applied to a 13.5% non-denaturing polyacrylamide gel. The gel was electrophoresed using 1200 volt-hours at 4° C. in TBE Buffer. Following electrophoresis, the gel was dried and the radioactivity was quantitated using a phosphorimager. The concentration of gentamicin added is indicated above each lane.

FIG. 4. The presence of 10 pg/ml gentamicin produces a gel mobility shift in the presence of the 16S rRNA oligonucleotide. 20 μl reactions containing increasing concentrations of gentamicin (100 ng/ml, 10 ng/ml, 1 ng/ml, 100 pg/ml, and 10 pg/ml) were added to 50 pmole end-labeled RNA oligonucleotide in TKM buffer and were treated as described for FIG. 3.

FIG. 5. Gentamicin binding to the 16S rRNA oligonucleotide is weak in the absence of MgCl₂. Reaction mixtures containing gentamicin (1 μg/ml, 100 μg/ml, 10 μg/ml, 1 μg/ml, 0.1 μg/ml, and 10 ng/ml) were treated as described for FIG. 3 except that the TKM buffer does not contain MgCl₂.

FIG. 6. Gel retardation analysis to detect peptide-RNA interactions. In reactions containing increasing concentrations of Tat₄₇₋₅₈ peptide (0.1 mM, 0.2 mM, 0.4 mM, 0.8 mM, 1.6 mM), 50 pmole TAR RNA oligonucleotide was added in TK buffer. The reaction mixture was then heated at 90° C. for 2 min and allowed to cool slowly to 24° C. The reactions were loaded onto a SCE9610 automated capillary electrophoresis apparatus (SpectruMedix; State College, Pennsylvania). The peaks correspond to the amount of free TAR RNA (“TAR”) or the Tat-TAR complex (“Tat-TAR”). The concentration of peptide added is indicated below each lane.

FIG. 7. Small molecules involved in nonsense suppression alter the chemical footprinting pattern in Domain V of the 28S rRNA. 100 pmol of ribosomes were incubated with 100 μM compound, followed by treatment with the chemical modifying agents kethoxal (KE) and dimethyl sulfate (DMS, not shown). Following chemical modification, rRNA was prepared and analyzed in primer extension reactions using end-labeled oligonucleotides hybridizing to rRNA. A sequencing reaction was run in parallel as a marker.

FIG. 8. Small molecules involved in nonsense suppression alter the chemical footprinting pattern in Domain V of the 28S rRNA. 100 pmol of ribosomes were incubated with 100 μM compound, followed by treatment with the chemical modifying agents kethoxal (KE) and dimethyl sulfate (DMS, not shown). Following chemical modification, rRNA was prepared and analyzed in primer extension reactions using end-labeled oligonucleotides hybridizing to rRNA. A sequencing reaction was run in parallel as a marker.

FIG. 9. Small molecules involved in nonsense suppression alter the chemical footprinting pattern in Domain II (GTPase Center) of the 28S rRNA. 100 pmol of ribosomes were incubated with 100 μM compound, followed by treatment with the chemical modifying agents kethoxal (KE) and dimethyl sulfate (DMS, not shown). Following chemical modification, rRNA was prepared and analyzed in primer extension reactions using end-labeled oligonucleotides hybridizing to rRNA. A sequencing reaction was run in parallel as a marker.

FIG. 10. Small molecules involved in nonsense suppression alter the chemical footprinting pattern of Domain II of the 28S rRNA. 100 pmol of ribosomes were incubated with 100 μM compound, followed by treatment with the chemical modifying agents dimethyl sulfate (DMS) and kethoxal (KE). Following chemical modification, rRNA was prepared and analyzed in primer extension reactions using end-labeled oligonucleotides hybridizing to rRNA. A sequencing reaction was run in parallel as a marker.

FIG. 11. A specific region of Domain II can compete for compound binding and prevents nonsense suppression in vitro. The in vitro nonsense suppression assay was performed using a luciferase construct with a UGA nonsense mutation. 0.1 mM compound was present in the reaction to induce nonsense suppression. Competitor RNA corresponding to Domain II was added at the indicated concentrations (0, 1, 2.5, 5, 7.5, 10 pM) to titrate the small molecule and prevent nonsense suppression.

5. DETAILED DESCRIPTION OF THE INVENTION

The present invention provides methods for identifying compounds that modulate translation termination and/or nonsense-mediated mRNA decay by identifying compounds that bind to preselected target elements of nucleic acids including, but not limited to, specific RNA sequences, RNA structural motifs, and/or RNA structural elements. In particular, the present invention provides methods of identifying compounds that bind to regions of the 28S rRNA and analogs thereof. The specific target RNA sequences, RNA structural motifs, and/or RNA structural elements (i.e., regions of the 28S rRNA and analogs thereof) are used as targets for screening small molecules and identifying those that directly bind these specific sequences, motifs, and/or structural elements. For example, methods are described in which a preselected target RNA having a detectable label or method of detection is used to screen a library of compounds, preferably under physiologic conditions; and any complexes formed between the target RNA and a member of the library are identified using physical methods that detect the labeled or altered physical property of the target RNA bound to a compound. Further, methods are described in which a preselected target RNA is used to screen a library of compounds, with each compound in the library having a detectable label or method of detection, preferably under physiologic conditions; and any complexes formed between the target RNA and a member of the library are identified using physical methods that detect the labeled or altered physical property of the compound bound to target RNA.

The present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA or RNA containing a premature stop codon), said methods comprising contacting a target RNA having a detectable label with a library of compounds free in solution, and detecting the formation of a target RNA:compound complex. In particular, the present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA or RNA containing a premature stop codon), said methods comprising contacting a target RNA having a detectable label with a library of compounds free in solution, in, e.g., labeled tubes or a microtiter plate, and detecting the formation of a target RNA:compound complex. Compounds in the library that bind to the labeled target RNA will form a detectably labeled complex. The detectably labeled complex can then be identified and removed from the uncomplexed, unlabeled compounds, and from uncomplexed, labeled target RNA, by a variety of methods, including, but not limited to, methods that differentiate changes in the electrophoretic, chromatographic, or thermostable properties of the complexed target RNA. Such methods include, but are not limited to, electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation proximity assay, structure-activity relationships (“SARS”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation.

The present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA or RNA containing a premature stop codon), said methods comprising contacting a target RNA with a library of compounds, wherein each compound in the library is detectably labeled, and detecting the formation of a target RNA:compound complex. In particular, the present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA, or RNA containing a premature stop codon), said methods comprising contacting a target RNA with a library of compounds free in solution, in, e.g., labeled tubes or a microtiter plate, wherein each compound in the library is detectably labeled, and detecting the formation of a target RNA:compound complex. Labeled compounds in the library that bind to the target RNA will form a detectably labeled complex. The detectably labeled complex can then be identified and removed from the uncomplexed, labeled compounds, and from uncomplexed target RNA, by a variety of methods, including, but not limited to, methods that differentiate changes in the electrophoretic, chromatographic, or thermostable properties of the complexed target RNA. Such methods include, but are not limited to, electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation proximity assay, structure-activity relationships (“SARS”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation.

The present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions or fragments of 28S rRNA or RNA containing a premature stop codon), said methods comprising contacting a target RNA having a detectable label with a library of compounds, wherein each compound in the library is attached to a solid support; and detecting the formation of a target RNA:compound complex. In particular, the present invention provides methods for identifying compounds that bind to a target RNA (in particular, regions of 28S rRNA or RNA containing a premature stop codon), said method comprising contacting a target RNA having a detectable label with a library of compounds, wherein each compound is attached to a solid support (e.g., a bead-based library of compounds or a microarray of compounds), and detecting the formation of a target RNA:compound complex. Compounds in the library that bind to the labeled target RNA will form a solid support-based (e.g., bead-based) detectably labeled complex, which can be separated from the unbound beads and unbound target RNA in the liquid phase by a number of physical means, including, but not limited to, flow cytometry, affinity chromatography, manual batch mode separation, suspension of beads in electric fields, and microwave of the bead-based detectably labeled complex.

The present invention provides methods for identifying compounds that bind to a target RNA (e.g., regions of 28S rRNA or RNA containing a premature stop codon), said methods comprising contacting a target RNA attached or conjugated to a solid support with a library of compounds, wherein each compound in the library is detectably labeled, and detecting the formation of a target RNA:compound complex. Target RNA molecules that bind to labeled compounds will form a solid support-based detectably labeled complex, which can be separated from the unbound solid support-target RNA and unbound compounds in the liquid phase by a number of means, including, but not limited to, flow cytometry, affinity chromatography, manual batch mode separation, suspension of beads in electric fields, and microwave of the bead-based detectably labeled complex.

Thus, the methods of the present invention provide a simple, sensitive assay for high-throughput screening of libraries of compounds, in which the compounds of the library that specifically bind a preselected target nucleic acid are easily distinguished from non-binding members of the library. In one embodiment, the structures of the binding molecules are deciphered from the input library by methods depending on the type of library that is used. In another embodiment, the structures of the binding molecules are ascertained by de novo structure determination of the compounds using, for example, mass spectrometry or nuclear magnetic resonance (“NMR”). The compounds so identified are useful for any purpose to which a binding reaction may be put, for example in assay methods, diagnostic procedures, cell sorting, as inhibitors of target molecule function, as probes, as sequestering agents and lead compounds for development of therapeutics, and the like. Small organic compounds that are identified to interact specifically with the target RNA molecules are particularly attractive candidates as lead compounds for the development of therapeutic agents.

The assays of the invention reduce bias introduced by competitive binding assays which require the identification and use of a host cell factor (presumably essential for modulating RNA function) as a binding partner for the target RNA. The assays of the present invention are designed to detect any compound or agent that binds to 28S rRNA, preferably under physiologic conditions. Such agents can then be tested for biological activity, without establishing or guessing which host cell factor or factors is required for modulating the function and/or activity of 28S rRNA.

5.1 28S rRNA and Analogs Thereof

The ribosome is a 2.5-MDa ribonucleoprotein complex involved in the decoding of genetic material from mRNA to proteins. A combination of biophysical and biochemical analyses has provided three-dimensional models of the ribosome as well as detailed analyses of the mechanism of the individual steps in translation (see, e.g., Green & Noller, 1997, Annu. Rev. Biochem. 66:679-716; Cate et al., 1999, Science 285(5436):2095-2104; and Ban et al., 2000, Science 289(5481):905-920).

The 28S rRNA is one of the ribosomal RNA components of the 60S subunit of eukaryotic ribosomes. The 28S rRNA sequences are conserved when expressed as mature rRNAs, although the 28S rRNA contains variable sequence tracts that are interspersed among conserved core sequences and lacking in the counterpart bacterial 23S rRNA (see, e.g., Hancock & Dover, 1988, Mol. Biol. Evol. 5:377-391). A diagram of the 28S rRNA is presented in FIG. 1, with domains II and V circled. As indicated in FIG. 1, a GTPase center has been mapped to domain II and the peptidyl transferase center has been mapped to domain V.

Compounds that interact in these regions or modulate local changes within these domains of the ribosome (e.g., alter base pairing interactions, base modification or modulate binding of trans-acting factors that bind to these regions) have the potential to modulate translation termination. These regions, i.e., domains II and V, are conserved from prokaryotes to eukaryotes, but the role of these regions in modulating translation termination has not been realized in eukaryotes. In bacteria, when a short RNA fragment, complementary to the E. coli 23S rRNA segment comprising nucleotides 735 to 766 (in domain II), is expressed in vivo, suppression of UGA nonsense mutations, but not UAA or UAG, results (Chernyaeva et al., 1999, J Bacteriol 181:5257-5262). Other regions of the 23S rRNA in E. coli have been implicated in nonsense suppression, including the GTPase center in domain II (nt 1034-1120; Jemiolo et al., 1995, Proc. Natl. Acad. Sci. USA 92:12309-12313).

Genetic studies in bacteria have also identified rRNA mutations that increase either the level of frameshifting in the trpE gene or the suppression of nonsense mutations in the trpA gene (reviewed in Green & Noller, 1997, Annu. Rev. Biochem. 66:679-716). The frameshifting mutations mapped to domains IV and V of the 23S rRNA. Disruption of the interaction of the CCA end of the tRNA with the peptidyl transferase center of the ribosome has been demonstrated to result in an increased translational error frequency (reviewed in Green & Noller, 1997, Annu. Rev. Biochem. 66:679-716).

Regions of the 28S rRNA involved in frameshifting, nonsense mutation suppression, GTPase activity, or peptidyl transferase activity are attractive target RNAs to identify compounds that modulate premature translation termination and/or nonsense-mediated mRNA decay. The interference of a compound with one or more of these functions could potentially mediate translation termination by interfering with premature translation termination. Without being bound by theory, a compound could potentially mediate translation termination by causing read through of a premature stop codon, therefore allowing the synthesis of the full-length protein.

In a preferred embodiment, the target RNA comprises a region of 28S rRNA corresponding to domain II (see, e.g., nucleotides 1310 to 2333 of accession number M11167) or domain V of 28S rRNA (see, e.g., nucleotides 3859 to 4425 of accession number M11167) or an analog thereof. It will become apparent to one of skill in the art that an analog of the 28S rRNA has an analogous structure and function to native 28S rRNA. For example, an analog of human 28S rRNA includes, but is not limited to, a human 28S rRNA retropseudogene (see, e.g., Wang et al., 1997, Gene 196:105-111, Accession Number L20636). Regions corresponding to domain II or domain V of the 28S rRNA pseudogene could be used as target RNAs in the present invention. In a preferred embodiment, the 28S rRNA is a human 28S rRNA, although the teachings of the present invention are applicable to mammals.
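The nucleotide ranges quoted above are 1-based and inclusive, which is easy to get wrong when slicing a sequence in software. The following is a minimal sketch, not part of the disclosed methods, of how the domain II and domain V regions might be cut out of a full-length sequence; the `Region` type and the function names are hypothetical, and the caller is assumed to supply the sequence of accession M11167 as a plain string.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A 1-based, inclusive nucleotide range (hypothetical helper type)."""
    name: str
    start: int
    end: int


# Coordinates quoted in the text for accession number M11167.
DOMAIN_II = Region("domain II", 1310, 2333)
DOMAIN_V = Region("domain V", 3859, 4425)


def region_length(region: Region) -> int:
    """Number of nucleotides covered by an inclusive range."""
    return region.end - region.start + 1


def extract_region(sequence: str, region: Region) -> str:
    """Return the sub-sequence for a 1-based inclusive region."""
    if region.start < 1 or region.start > region.end or region.end > len(sequence):
        raise ValueError(f"{region.name}: range outside sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[region.start - 1:region.end]
```

With these coordinates the domain II target is 1024 nucleotides long and the domain V target is 567 nucleotides long.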

Synthesis of the target RNAs, i.e., regions of 28S rRNA, can be performed by methods known to one of skill in the art (see, e.g., Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York and Glover, D. M. (ed.), 1985, DNA Cloning: A Practical Approach, IRL Press, Ltd., Oxford, U.K. Vol. I, II). In a preferred embodiment, the target RNAs are cloned as DNAs downstream of a promoter, such as but not limited to T7, T3, or Sp6 promoters, and in vitro transcribed with the corresponding polymerase. A detectable label can be incorporated into the in vitro transcribed RNA or alternatively, the target RNA is end-labeled (see Section 5.3 infra). Alternatively, the target RNA can be amplified by polymerase chain reaction with a primer containing an RNA promoter and subsequently in vitro transcribed, as described in U.S. Pat. No. 6,271,002, which is incorporated by reference in its entirety.
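The PCR route described above can be sketched in a few lines: a forward primer carries the promoter followed by the first bases of the target, and the transcript is the RNA copy of everything downstream of the transcription start. This is an illustrative sketch only. The promoter string below is a commonly used T7 promoter ending in GGG and is an assumption (the text does not specify a promoter sequence), as are the function names and the 20-nucleotide annealing length.

```python
# Assumption: a commonly used T7 promoter; transcription is taken to start
# at the first G of the trailing GGG, so the transcript begins with GGG.
T7_PROMOTER = "TAATACGACTCACTATAGGG"


def _check_dna(seq: str) -> str:
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("target must be a non-empty DNA sequence (A, C, G, T)")
    return seq


def forward_primer(target_dna: str, anneal_len: int = 20) -> str:
    """Promoter-containing forward primer: promoter + start of the target."""
    target = _check_dna(target_dna)
    if len(target) < anneal_len:
        raise ValueError("target shorter than the annealing region")
    return T7_PROMOTER + target[:anneal_len]


def expected_transcript(target_dna: str) -> str:
    """RNA expected from the promoter-tagged template (sense strand given)."""
    target = _check_dna(target_dna)
    return ("GGG" + target).replace("T", "U")
```

Note that under this assumption the transcript carries three extra 5′ guanosines that are not part of the native 28S rRNA region, which may matter when the 5′ end is later used for labeling.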

5.2 Stop Codon Containing Target RNA

The present invention provides for methods for screening and identifying compounds that modulate premature translation termination and/or nonsense-mediated mRNA decay. A target RNA may be engineered to contain a premature stop codon or, alternatively, a target RNA may naturally contain a premature stop codon. The premature stop codon may be any one of the stop codons known in the art, including UAG, UAA and UGA.

The stop codons are UAG, UAA, and UGA, i.e., signals to the ribosome to terminate protein synthesis, presumably through protein release factors. Even though the use of these stop codons is widespread, they are not universal. For example, UGA specifies tryptophan in the mitochondria of mammals, yeast, Neurospora crassa, Drosophila, protozoa, and plants (see, e.g., Breitenberger & RajBhandary, 1985, Trends Biochem Sci 10:481). Other examples include the use of UGA for tryptophan in Mycoplasma and, in ciliated protozoa, the use of UAA and UAG for glutamine (see, e.g., Jukes et al., 1987, Cold Spring Harb Symp Quant Biol. 52:769-776), the use of UGA for cysteine in the ciliate Euplotes aediculatus (see, e.g., Kervestin et al., 2001, EMBO Rep 2(8):680-684), the use of UGA for tryptophan in Blepharisma americanum and the use of UAR for glutamine in Tetrahymena and three spirotrichs, Stylonychia lemnae, S. mytilus, and Oxytricha trifallax (see, e.g., Lozupone et al., 2001, Curr Biol 11(2):65-74). It has been proposed that the ancestral mitochondrion bore the universal genetic code and subsequently reassigned the UGA codon to tryptophan independently, at least in the lineages of ciliates, kinetoplastids, rhodophytes, prymnesiophytes, and fungi (see, e.g., Inagaki et al., 1998, J Mol Evol 47(4):378-384).
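The reassignments listed above can be collected into a small lookup table, which makes it easy to ask which of the three codons still terminates translation in a given system. The sketch below records only the reassignments named in this paragraph; the system labels and function names are hypothetical, and treating every unlisted codon as a stop is a simplifying assumption rather than a statement about each organism's complete genetic code.

```python
STOP_CODONS = ("UAG", "UAA", "UGA")

# Reassignments quoted in the text. Assumption: a codon not listed for a
# system keeps its standard meaning ("Stop") in that system.
REASSIGNMENTS = {
    "standard": {},
    "mammalian mitochondria": {"UGA": "Trp"},
    "Mycoplasma": {"UGA": "Trp"},
    "Tetrahymena": {"UAA": "Gln", "UAG": "Gln"},
    "Euplotes aediculatus": {"UGA": "Cys"},
    "Blepharisma americanum": {"UGA": "Trp"},
}


def codon_meaning(codon: str, system: str = "standard") -> str:
    """Return 'Stop' or the amino acid a stop codon encodes in a system."""
    codon = codon.upper().replace("T", "U")
    if codon not in STOP_CODONS:
        raise ValueError("only UAG, UAA and UGA are covered by this table")
    return REASSIGNMENTS[system].get(codon, "Stop")


def terminating_codons(system: str) -> list:
    """Codons from the standard stop set that still terminate in a system."""
    return [c for c in STOP_CODONS if codon_meaning(c, system) == "Stop"]
```

For example, `terminating_codons("Tetrahymena")` leaves only UGA, whereas the standard table returns all three codons.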

The readthrough of stop codons also occurs in positive-sense ssRNA viruses by a variety of naturally occurring suppressor tRNAs. Such naturally occurring suppressor tRNAs include, but are not limited to, cytoplasmic tRNATyr, which reads through the UAG stop codon; cytoplasmic tRNAsGln, which read through UAG and UAA; cytoplasmic tRNAsLeu, which read through UAG; chloroplast and cytoplasmic tRNAsTrp, which read through UGA; chloroplast and cytoplasmic tRNAsCys, which read through UGA; cytoplasmic tRNAsArg, which read through UGA (see, e.g., Beier & Grimm, 2001, Nucl Acids Res 29(23):4767-4782 for a review); and the use of selenocysteine to suppress UGA in E. coli (see, e.g., Baron & Böck, 1995, The selenocysteine inserting tRNA species: structure and function. In Söll, D. and RajBhandary, U.L. (eds), tRNA: Structure, Biosynthesis and Function, ASM Press, Washington, D.C., pp. 529-544). The mechanism is thought to involve unconventional base interactions and/or codon context effects.

As described above, the stop codons are not necessarily universal, with considerable variation amongst organelles (e.g., mitochondria and chloroplasts), viruses (e.g., single-stranded viruses), and protozoa (e.g., ciliated protozoa) as to whether the codons UAG, UAA, and UGA signal translation termination or encode amino acids. Even though a single release factor most probably recognizes all of the stop codons in eukaryotes, it appears that not all of the stop codons are suppressed in a similar manner. For example, in the yeast Schizosaccharomyces pombe, nonsense suppression has to be strictly codon specific (see, e.g., Hottinger et al., 1984, EMBO J 3:423-428). In another example, significant differences were found in the degree of suppression amongst three UAG codons and two UAA codons in different mRNA contexts in Escherichia coli and in human 293 cells, although data suggested that the context effects of nonsense suppression operated differently in E. coli and human cells (see, e.g., Martin et al., 1989, Mol Gen Genet 217(2-3):411-418). Since unconventional base interactions and/or codon context effects have been implicated in nonsense suppression, it is conceivable that compounds involved in nonsense suppression of one stop codon may not necessarily be involved in nonsense suppression of another stop codon. In other words, compounds involved in suppressing UAG codons may not necessarily be involved in suppressing UGA codons.

In a specific embodiment, a target RNA contains or is engineered to contain the premature stop codon UAG. In another embodiment, a target RNA contains or is engineered to contain the premature stop codon UGA.

In a particular embodiment, a target RNA contains or is engineered to contain two or more stop codons. In accordance with this embodiment, the stop codons are preferably at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, at least 75 nucleotides or at least 100 nucleotides apart from each other. Further, in accordance with this embodiment, at least one of the stop codons is preferably UAG or UGA.

In a specific embodiment, a target RNA contains or is engineered to contain a premature stop codon at least 15 nucleotides, preferably at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides or at least 75 nucleotides from the start codon in the coding sequence. In another embodiment, a target RNA contains or is engineered to contain a premature stop codon at least 15 nucleotides, preferably at least 25 nucleotides, at least 50 nucleotides, at least 75 nucleotides, at least 100 nucleotides, at least 125 nucleotides, at least 150 nucleotides, at least 175 nucleotides or at least 200 nucleotides from the native stop codon in the coding sequence of the full-length protein, polypeptide or peptide. In another embodiment, a target RNA contains or is engineered to contain a premature stop codon at least 15 nucleotides (preferably at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides or at least 75 nucleotides) from the start codon in the coding sequence and at least 15 nucleotides (preferably at least 25 nucleotides, at least 50 nucleotides, at least 75 nucleotides, at least 100 nucleotides, at least 125 nucleotides, at least 150 nucleotides, at least 175 nucleotides or at least 200 nucleotides) from the native stop codon in the coding sequence of the full-length protein, polypeptide or peptide. In accordance with these embodiments, the premature stop codon is preferably UAG or UGA.
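The distance criteria in the embodiments above lend themselves to a simple check on a candidate coding sequence. The sketch below is illustrative only and rests on stated assumptions: the sequence is given from the start codon through the native stop codon, and "distance" is counted as the number of nucleotides lying between the two codons, which is one possible reading since the text does not define the counting convention. The function name and defaults are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"UAG", "UAA", "UGA"}


def qualifying_premature_stops(
    cds: str,
    min_from_start: int = 15,
    min_from_native_stop: int = 15,
) -> List[Tuple[int, str]]:
    """Return (0-based offset, codon) for in-frame premature stop codons
    that satisfy both spacing thresholds.

    Assumptions: `cds` runs from the start codon to the native stop codon,
    and spacing is the count of nucleotides between the two codons.
    """
    cds = cds.upper().replace("T", "U")
    if len(cds) < 6 or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a whole number of codons")
    native = len(cds) - 3
    if cds[native:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with the native stop codon")

    hits = []
    for pos in range(3, native, 3):  # skip the start codon and native stop
        codon = cds[pos:pos + 3]
        if codon not in STOP_CODONS:
            continue
        gap_to_start = pos - 3               # nucleotides after the start codon
        gap_to_native = native - (pos + 3)   # nucleotides before the native stop
        if gap_to_start >= min_from_start and gap_to_native >= min_from_native_stop:
            hits.append((pos, codon))
    return hits
```

For a toy sequence such as `"AUG" + "GCU" * 5 + "UGA" + "GCU" * 5 + "UAA"`, the UGA codon at offset 18 lies exactly 15 nucleotides from each end and is therefore reported under the default thresholds.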

The premature translation stop codon can be produced by in vitro mutagenesis techniques such as, but not limited to, polymerase chain reaction (“PCR”), linker insertion, oligonucleotide-mediated mutagenesis, and random chemical mutagenesis.
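When a premature stop codon is to be introduced by oligonucleotide-mediated mutagenesis, it can help to first see which stop codons are a single base change away from the codon being replaced. The following sketch does this in silico; it is an illustration under assumptions, not a description of any particular mutagenesis protocol, and the function names are hypothetical.

```python
STOP_CODONS = ("UAG", "UAA", "UGA")


def _normalize(seq: str) -> str:
    seq = seq.upper().replace("T", "U")
    if set(seq) - set("ACGU"):
        raise ValueError("sequence must contain only A, C, G, U/T")
    return seq


def single_substitution_stops(codon: str):
    """List (stop codon, 0-based position, new base) for every stop codon
    reachable from `codon` by exactly one nucleotide substitution."""
    codon = _normalize(codon)
    if len(codon) != 3:
        raise ValueError("a codon has three nucleotides")
    routes = []
    for stop in STOP_CODONS:
        diffs = [i for i in range(3) if codon[i] != stop[i]]
        if len(diffs) == 1:
            routes.append((stop, diffs[0], stop[diffs[0]]))
    return routes


def introduce_stop(cds: str, codon_index: int, stop: str = "UGA") -> str:
    """Replace an internal codon (0-based index) of `cds` with a stop codon."""
    cds = _normalize(cds)
    stop = _normalize(stop)
    n_codons = len(cds) // 3
    if stop not in STOP_CODONS:
        raise ValueError("stop must be UAG, UAA or UGA")
    if len(cds) % 3 != 0 or not 0 < codon_index < n_codons - 1:
        raise ValueError("codon_index must address an internal codon")
    start = codon_index * 3
    return cds[:start] + stop + cds[start + 3:]
```

For instance, a tryptophan codon UGG can be converted to UAG or UGA with a single G-to-A change, which keeps the mutagenic oligonucleotide close to the wild-type sequence.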

5.3 Target RNAs (Detectably Labeled or Attached to a Solid Support)

Target nucleic acids, including but not limited to RNA and DNA, useful in the methods of the present invention have a label that is detectable via conventional spectroscopic means or radiographic means. Preferably, target nucleic acids are labeled with a covalently attached dye molecule. Useful dye-molecule labels include, but are not limited to, fluorescent dyes, phosphorescent dyes, ultraviolet dyes, infrared dyes, and visible dyes. Preferably, the dye is a visible dye.

Useful labels in the present invention can include, but are not limited to, spectroscopic labels such as fluorescent dyes (e.g., fluorescein and derivatives such as fluorescein isothiocyanate (FITC) and Oregon Green™, rhodamine and derivatives (e.g., Texas red, tetramethylrhodamine isothiocyanate (TRITC), bora-3a,4a-diaza-s-indacene (BODIPY®) and derivatives, etc.), digoxigenin, biotin, phycoerythrin, AMCA, CyDye™, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, ³²P, ³³P, etc.), enzymes (e.g., horseradish peroxidase, alkaline phosphatase, etc.), spectroscopic colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads, or nanoparticles, i.e., nanoclusters of inorganic ions with defined dimension from 0.1 to 1000 nm. Useful affinity tags and complementary partners include, but are not limited to, biotin-streptavidin, complementary nucleic acid fragments (e.g., oligo dT-oligo dA, oligo T-oligo A, oligo dG-oligo dC, oligo G-oligo C), aptamer-streptavidin, or haptens and proteins for which antisera or monoclonal antibodies are available. The label may be coupled directly or indirectly to a component of the detection assay (e.g., the detection reagent) according to methods well known in the art. A wide variety of labels may be used, with the choice of label depending on sensitivity required, ease of conjugation with the compound, stability requirements, available instrumentation, and disposal provisions.

In one embodiment, nucleic acids that are labeled at one or more specific locations are chemically synthesized using phosphoramidite or other solution or solid-phase methods. Detailed descriptions of the chemistry used to form polynucleotides by the phosphoramidite method are well known (see, e.g., Caruthers et al., U.S. Pat. Nos. 4,458,066 and 4,415,732; Caruthers et al., 1982, Genetic Engineering 4:1-17; Users Manual Model 392 and 394 Polynucleotide Synthesizers, 1990, pages 6-1 through 6-22, Applied Biosystems, Part No. 901237; Ojwang et al., 1997, Biochemistry 36:6033-6045). The phosphoramidite method of polynucleotide synthesis is the preferred method because of its efficient and rapid coupling and the stability of the starting materials. The synthesis is performed with the growing polynucleotide chain attached to a solid support, such that excess reagents, which are generally in the liquid phase, can be easily removed by washing, decanting, and/or filtration, thereby eliminating the need for purification steps between synthesis cycles.

The following briefly describes illustrative steps of a typical polynucleotide synthesis cycle using the phosphoramidite method. First, a solid support to which is attached a protected nucleoside monomer at its 3′ terminus is treated with acid, e.g., trichloroacetic acid, to remove the 5′-hydroxyl protecting group, freeing the hydroxyl group for a subsequent coupling reaction. After the deprotection is completed, an activated intermediate is formed by contacting the support-bound nucleoside with a protected nucleoside phosphoramidite monomer and a weak acid, e.g., tetrazole. The weak acid protonates the nitrogen atom of the phosphoramidite, forming a reactive intermediate. Nucleoside addition is generally complete within 30 seconds. Next, a capping step is performed, which terminates any polynucleotide chains that did not undergo nucleoside addition. Capping is preferably performed using acetic anhydride and 1-methylimidazole. The phosphite group of the internucleotide linkage is then converted to the more stable phosphotriester by oxidation using iodine as the preferred oxidizing agent and water as the oxygen donor. After oxidation, the hydroxyl protecting group of the newly added nucleoside is removed with a protic acid, e.g., trichloroacetic acid or dichloroacetic acid, and the cycle is repeated one or more times until chain elongation is complete. After synthesis, the polynucleotide chain is cleaved from the support using a base, e.g., ammonium hydroxide or t-butyl amine. The cleavage reaction also removes any phosphate protecting groups, e.g., cyanoethyl. Finally, the protecting groups on the exocyclic amines of the bases and any protecting groups on the dyes are removed by treating the polynucleotide solution in base at an elevated temperature, e.g., at about 55° C. Preferably the various protecting groups are removed using ammonium hydroxide or t-butyl amine.
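The cycle described above can be laid out as an ordered plan, which also makes clear why the fraction of full-length product falls as the chain grows. The sketch below is illustrative only: the step labels paraphrase the text, the function names are hypothetical, and the per-cycle coupling efficiency is an assumed input that the text does not specify. Because the first (3′) nucleoside is already attached to the support, a chain of n nucleotides needs n − 1 coupling cycles.

```python
# One synthesis cycle, in the order given in the text.
CYCLE_STEPS = (
    "deprotect 5'-hydroxyl (acid)",
    "couple phosphoramidite (weak acid activation)",
    "cap unreacted chains",
    "oxidize phosphite to phosphotriester",
)


def synthesis_plan(sequence_5_to_3: str):
    """Return the ordered (step, base) list for a sequence written 5'->3'.

    The chain grows from the support-bound 3' nucleoside toward the 5' end.
    """
    if not sequence_5_to_3:
        raise ValueError("sequence must not be empty")
    bases = sequence_5_to_3[::-1]  # 3' nucleoside first
    plan = [("support-bound 3' nucleoside", bases[0])]
    for base in bases[1:]:
        for step in CYCLE_STEPS:
            plan.append((step, base))
    return plan


def full_length_fraction(length: int, coupling_efficiency: float) -> float:
    """Expected fraction of full-length chains, assuming every cycle couples
    with the same efficiency (an assumption, not a value from the text)."""
    if length < 1 or not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("length must be >= 1 and efficiency within [0, 1]")
    return coupling_efficiency ** (length - 1)
```

Under an assumed efficiency of 0.99 per cycle, `full_length_fraction(21, 0.99)` evaluates 0.99 raised to the 20th power, which illustrates why the capping step and post-synthesis purification matter for longer targets.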

Any of the nucleoside phosphoramidite monomers can be labeled using standard phosphoramidite chemistry methods (Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96(23):12997-13002; Ojwang et al., 1997, Biochemistry 36:6033-6045 and references cited therein). Dye molecules useful for covalently coupling to phosphoramidites preferably comprise a primary hydroxyl group that is not part of the dye's chromophore. Illustrative dye molecules include, but are not limited to, disperse dye CAS 4439-31-0, disperse dye CAS 6054-58-6, disperse dye CAS 4392-69-2 (Sigma-Aldrich, St. Louis, Mo.), disperse red, and 1-pyrenebutanol (Molecular Probes, Eugene, Oreg.). Other dyes useful for coupling to phosphoramidites will be apparent to those of skill in the art, such as fluorescein, Cy3, and Cy5 fluorescent dyes, and may be purchased from, e.g., Sigma-Aldrich, St. Louis, Mo. or Molecular Probes, Inc., Eugene, Oreg.

In another embodiment, dye-labeled target molecules are synthesized enzymatically using in vitro transcription (Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96(23):12997-13002 and references cited therein). In this embodiment, a mixture of ribonucleoside-5′-triphosphates capable of supporting template-directed enzymatic extension (e.g., a mixture including GTP, ATP, CTP, and UTP, including one or more dye-labeled ribonucleotides; Sigma-Aldrich, St. Louis, Mo.) is added to a promoter-containing DNA template. Next, a polymerase enzyme is added to the mixture under conditions where the polymerase enzyme is active, which are well-known to those skilled in the art. A labeled polynucleotide is formed by the incorporation of the labeled ribonucleotides during polymerase-mediated strand synthesis.

In yet another embodiment of the invention, nucleic acid molecules are end-labeled after their synthesis. Methods for labeling the 5′-end of an oligonucleotide include but are by no means limited to: (i) periodate oxidation of a 5′-to-5′-coupled ribonucleotide, followed by reaction with an amine-reactive label (Heller & Morrison, 1985, in Rapid Detection and Identification of Infectious Agents, D. T. Kingsbury and S. Falkow, eds., pp. 245-256, Academic Press); (ii) condensation of ethylenediamine with 5′-phosphorylated polynucleotide, followed by reaction with an amine-reactive label (Morrison, European Patent Application 232 967); (iii) introduction of an aliphatic amine substituent using an aminohexyl phosphite reagent in solid-phase DNA synthesis, followed by reaction with an amine-reactive label (Cardullo et al., 1988, Proc. Natl. Acad. Sci. USA 85:8790-8794); and (iv) introduction of a thiophosphate group on the 5′-end of the nucleic acid, using phosphatase treatment followed by end-labeling with ATP-γS and kinase, which reacts specifically and efficiently with maleimide-labeled fluorescent dyes (Czworkowski et al., 1991, Biochem. 30:4821-4830).

A detectable label should not be incorporated into a target nucleic acid at the specific binding site at which compounds are likely to bind, since the presence of a covalently attached label might interfere sterically or chemically with the binding of the compounds at this site. Accordingly, if the region of the target nucleic acid that binds to a host cell factor is known, a detectable label is preferably incorporated into the nucleic acid molecule at one or more positions that are spatially or sequentially remote from the binding region.

After synthesis, the labeled target nucleic acid can be purified using standard techniques known to those skilled in the art (see Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96(23):12997-13002 and references cited therein). Depending on the length of the target nucleic acid and the method of its synthesis, such purification techniques include, but are not limited to, reverse-phase high-performance liquid chromatography (“reverse-phase HPLC”), fast performance liquid chromatography (“FPLC”), and gel purification. After purification, the target RNA is refolded into its native conformation, preferably by heating to approximately 85-95° C. and slowly cooling to room temperature in a buffer, e.g., a buffer comprising about 50 mM Tris-HCl, pH 8 and 100 mM NaCl.

In another embodiment, the target nucleic acid can also be radiolabeled. A radiolabel, such as, but not limited to, an isotope of phosphorus, sulfur, or hydrogen, may be incorporated into a nucleotide, which is added either after or during the synthesis of the target nucleic acid. Methods for the synthesis and purification of radiolabeled nucleic acids are well known to one of skill in the art. See, e.g., Sambrook et al., 1989, in Molecular Cloning: A Laboratory Manual, pp. 10.2-10.70, Cold Spring Harbor Laboratory Press, and the references cited therein, which are hereby incorporated by reference in their entireties.

In another embodiment, the target nucleic acid can be attached to an inorganic nanoparticle. A nanoparticle is a cluster of ions with controlled size from 0.1 to 1000 nm comprised of metals, metal oxides, or semiconductors including, but not limited to, Ag₂S, ZnS, CdS, CdTe, Au, or TiO₂. Nanoparticles have unique optical, electronic and catalytic properties relative to bulk materials, which can be adjusted according to the size of the particle. Methods for the attachment of nucleic acids to nanoparticles are well known to one of skill in the art (see, e.g., Niemeyer, 2001, Angew. Chem. Int. Ed. 40:4129-4158, International Patent Publication WO 02/18643, and the references cited therein, the disclosures of which are hereby incorporated by reference in their entireties).

In yet another embodiment of the invention, target nucleic acids can be attached or conjugated to a solid support for use in the assays of the invention. There are a number of methods, known in the art, that can be used to immobilize nucleic acids on a solid support. For example, modified DNA has been covalently immobilized to a variety of surfaces using amino acids (see, e.g., Running, J. A., and Urdea, M. S. (1990) Biotechniques 8, 276-277; Newton, C. R., et al. (1993) Nucl. Acids Res. 21, 1155-1162; Nikiforov, T. T., and Rogers, Y. H. (1995) Anal. Biochem. 227, 201-209). Alternatively, carboxyl groups (Zhang, Y., et al. (1991) Nucl. Acids Res. 19, 3929-3933), epoxy groups (Lamture, J. B., et al. (1994) Nucl. Acids Res. 22, 2121-2125; Eggers, M. D., et al. (1994) BioTechniques 17, 516-524) or amino groups (Rasmussen, S. R., et al. (1991) Anal. Biochem. 198, 138-142) can be used to attach nucleic acids to solid surfaces. Such embodiments would be useful in, e.g., high throughput assays intended to screen a library of compounds in order to identify molecules that bind to target nucleic acids that have been attached to a solid support. In a particular embodiment, target RNA molecules are attached or conjugated to a solid support, e.g., a slide or a bead, using an appropriate molecule that does not interfere with their binding to compounds of the invention, and are then screened with a library of compounds. Members of a library of compounds are preferably detectably labeled so that compounds that bind to target RNAs can be identified. Suitable detectable labels that can be used to label compounds are known in the art and also described herein. In a more preferred embodiment, target RNA molecules are immobilized on a surface suitable for performing microarray assays. Any technique known in the art can be used to immobilize nucleic acid molecules on a solid support surface. The nucleic acid is preferably covalently attached to the solid support.

5.4 Libraries of Small Molecules

Libraries screened using the methods of the present invention can comprise a variety of types of compounds. In one embodiment, the libraries screened using the methods of the present invention can comprise a variety of types of compounds on solid supports. In other embodiments described below, the libraries can be synthesized on solid supports or the compounds of the library can be attached to solid supports by linkers. In some embodiments, the compounds are nucleic acid or peptide molecules. In a non-limiting example, peptide molecules can exist in a phage display library. In other embodiments, types of compounds include, but are not limited to, peptide analogs including peptides comprising non-naturally occurring amino acids, e.g., D-amino acids, phosphorous analogs of amino acids, such as α-amino phosphoric acids, or amino acids having non-peptide linkages, nucleic acid analogs such as phosphorothioates and PNAs, hormones, antigens, synthetic or naturally occurring drugs, opiates, dopamine, serotonin, catecholamines, thrombin, acetylcholine, prostaglandins, organic molecules, pheromones, adenosine, sucrose, glucose, lactose and galactose. Libraries of polypeptides or proteins can also be used.

In a preferred embodiment, the combinatorial libraries are small organic molecule libraries, such as, but not limited to, benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, morpholino compounds, and diazepindiones. In another embodiment, the combinatorial libraries comprise peptoids; random bio-oligomers; benzodiazepines; diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous polypeptides; nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates; peptide nucleic acid libraries; antibody libraries; or carbohydrate libraries. Combinatorial libraries are themselves commercially available (see, e.g., Advanced ChemTech Europe Ltd., Cambridgeshire, UK; ASINEX, Moscow, Russia; BioFocus plc, Sittingbourne, UK; Bionet Research (a division of Key Organics Limited), Camelford, UK; ChemBridge Corporation, San Diego, Calif.; ChemDiv Inc, San Diego, Calif.; ChemRx Advanced Technologies, South San Francisco, Calif.; ComGenex Inc., Budapest, Hungary; Evotec OAI Ltd, Abingdon, UK; IF LAB Ltd., Kiev, Ukraine; Maybridge plc, Cornwall, UK; PharmaCore, Inc., North Carolina; SIDDCO Inc, Tucson, Ariz.; TimTec Inc., Newark, Del.; Tripos Receptor Research Ltd, Bude, UK; Toslab, Ekaterinburg, Russia). In a specific embodiment, the combinatorial libraries are small molecules.

In another embodiment, combinatorial libraries useful in the present invention are combinatorial libraries of labeled compounds, with each compound in the library having a label that is detectable via conventional spectroscopic means or radiographic means. Preferably, compounds are labeled with a covalently attached and detectable isotope. Other useful labels in the present invention include, but are not limited to, fluorescent tags or dye molecules. Useful dye molecules include, for example, fluorescent dyes, phosphorescent dyes, ultraviolet dyes, infrared dyes, and visible dyes. Useful fluorescent tags include, for example, fluorescein and derivatives such as fluorescein isothiocyanate (FITC) and Oregon Green™, rhodamine and derivatives (e.g., Texas red, tetramethylrhodamine isothiocyanate (TRITC), etc.), bora-3a,4a-diaza-s-indacene (BODIPY®) and derivatives, digoxigenin, biotin, phycoerythrin, AMCA, CyDye™, and the like. Other useful labels include radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, ³²P, ³³P, etc.), enzymes (e.g., horse radish peroxidase, alkaline phosphatase, etc.), spectroscopic colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads, and nanoparticles, i.e., nanoclusters of inorganic ions with defined dimension from 0.1 to 1000 nm. The label may be coupled directly or indirectly to a component of the detection assay (e.g., the detection reagent) according to methods well known in the art. A wide variety of labels may be used, with the choice of label depending on sensitivity required, ease of conjugation with the compound, stability requirements, available instrumentation, and disposal provisions.

In one embodiment, the combinatorial compound library for the methods of the present invention may be synthesized. There is a great interest in synthetic methods directed toward the creation of large collections of small organic compounds, or libraries, which could be screened for pharmacological, biological or other activity (Dolle, 2001, J. Comb. Chem. 3:477-517; Hall et al., 2001, J. Comb. Chem. 3:125-150; Dolle, 2000, J. Comb. Chem. 2:383-433; Dolle, 1999, J. Comb. Chem. 1:235-282). The synthetic methods applied to create vast combinatorial libraries are performed in solution or in the solid phase, i.e., on a solid support. Solid-phase synthesis makes it easier to conduct multi-step reactions and to drive reactions to completion with high yields because excess reagents can be easily added and washed away after each reaction step. Solid-phase combinatorial synthesis also tends to improve isolation, purification and screening. However, the more traditional solution phase chemistry supports a wider variety of organic reactions than solid-phase chemistry. Methods and strategies for the synthesis of combinatorial libraries can be found in A Practical Guide to Combinatorial Chemistry, A. W. Czarnik and S. H. Dewitt, eds., American Chemical Society, 1997; The Combinatorial Index, B. A. Bunin, Academic Press, 1998; Organic Synthesis on Solid Phase, F. Z. Dörwald, Wiley-VCH, 2000; and Solid-Phase Organic Syntheses, Vol. 1, A. W. Czarnik, ed., Wiley Interscience, 2001.

Combinatorial compound libraries of the present invention may be synthesized using apparatuses described in U.S. Pat. No. 6,358,479 to Frisina et al., U.S. Pat. No. 6,190,619 to Kilcoin et al., U.S. Pat. No. 6,132,686 to Gallup et al., U.S. Pat. No. 6,126,904 to Zuellig et al., U.S. Pat. No. 6,074,613 to Harness et al., U.S. Pat. No. 6,054,100 to Stanchfield et al., and U.S. Pat. No. 5,746,982 to Saneii et al., which are hereby incorporated by reference in their entirety. These patents describe synthesis apparatuses capable of holding a plurality of reaction vessels for parallel synthesis of multiple discrete compounds or for combinatorial libraries of compounds.

In one embodiment, the combinatorial compound library can be synthesized in solution. The method disclosed in U.S. Pat. No. 6,194,612 to Boger et al., which is hereby incorporated by reference in its entirety, features compounds useful as templates for solution phase synthesis of combinatorial libraries. The template is designed to permit reaction products to be easily purified from unreacted reactants using liquid/liquid or solid/liquid extractions. The compounds produced by combinatorial synthesis using the template will preferably be small organic molecules. Some compounds in the library may mimic the effects of non-peptides or peptides. In contrast to solid-phase synthesis of combinatorial compound libraries, liquid-phase synthesis does not require the use of specialized protocols for monitoring the individual steps of a multistep solid-phase synthesis (Egner et al., 1995, J. Org. Chem. 60:2652; Anderson et al., 1995, J. Org. Chem. 60:2650; Fitch et al., 1994, J. Org. Chem. 59:7955; Look et al., 1994, J. Org. Chem. 59:7588; Metzger et al., 1993, Angew. Chem., Int. Ed. Engl. 32:894; Youngquist et al., 1994, Rapid Commun. Mass Spect. 8:77; Chu et al., 1995, J. Am. Chem. Soc. 117:5419; Brummel et al., 1994, Science 264:399; Stevanovic et al., 1993, Bioorg. Med. Chem. Lett. 3:431).

Combinatorial compound libraries useful for the methods of the present invention can be synthesized on solid supports. In one embodiment, a split synthesis method, a protocol of separating and mixing solid supports during the synthesis, is used to synthesize a library of compounds on solid supports (see Lam et al., 1997, Chem. Rev. 97:411-448; Ohlmeyer et al., 1993, Proc. Natl. Acad. Sci. USA 90:10922-10926 and references cited therein). Each solid support in the final library has substantially one type of compound attached to its surface. Other methods for synthesizing combinatorial libraries on solid supports, wherein one product is attached to each support, will be known to those of skill in the art (see, e.g., Nefzi et al., 1997, Chem. Rev. 97:449-472 and U.S. Pat. No. 6,087,186 to Cargill et al., which are hereby incorporated by reference in their entirety).
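The split synthesis protocol described above can be illustrated with a short simulation. The following Python sketch is illustrative only and is not part of the disclosed methods: the function name `split_and_pool`, the single-letter building blocks, and the bead count are hypothetical. It assumes idealized chemistry in which every coupling step goes to completion, so that each bead carries exactly one compound.

```python
import random


def split_and_pool(n_beads, building_blocks_per_step, seed=0):
    """Simulate split synthesis on n_beads solid supports.

    building_blocks_per_step is a list of lists; each inner list holds the
    building blocks coupled in one synthesis step (hypothetical labels).
    Returns one tuple per bead recording the building blocks it received.
    """
    rng = random.Random(seed)
    beads = [() for _ in range(n_beads)]  # every bead starts with no compound
    for blocks in building_blocks_per_step:
        rng.shuffle(beads)  # pool and mix all beads
        k = len(blocks)
        # split the pooled beads into one portion per building block
        portions = [beads[i::k] for i in range(k)]
        # couple a single building block to every bead in its portion
        beads = [
            bead + (block,)
            for block, portion in zip(blocks, portions)
            for bead in portion
        ]
    return beads


if __name__ == "__main__":
    steps = [["A", "B", "C"], ["D", "E", "F"], ["G", "H", "I"]]
    library = split_and_pool(90, steps, seed=1)
    print(len(library), "beads;", len(set(library)), "distinct compounds")
```

Because each bead sits in exactly one portion at each step, it accumulates a single sequence of building blocks, which corresponds to the statement that each solid support carries substantially one type of compound.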

As used herein, the term “solid support” is not limited to a specific type of solid support. Rather, a large number of supports are available and are known to one skilled in the art. Solid supports that can be used in the assays of the invention include, for example, any surface to which compounds, either natively or via a linker, can be attached. Solid supports include silica gels, resins, derivatized plastic films, glass beads, glass slides (e.g., Hergenrother et al., 2000, J. Am. Chem. Soc. 122:7849-7850 and Kuruvilla et al., 2002, Nature 416:653-657), cotton, plastic beads, polystyrene beads, doped polystyrene beads (as described by Fenniri et al., 2001, J. Am. Chem. Soc. 123:8151-8152), polystyrene macrobeads (as described by Blackwell et al., 2001, Chemistry & Biology 8:1167-1182), alumina gels, and polysaccharides. In a specific embodiment, the solid support is a glass slide. In a more specific embodiment, the solid support is a glass microscope slide.

A suitable solid support may be selected on the basis of desired end use and suitability for various synthetic protocols. For example, for peptide synthesis, a solid support can be a resin such as p-methylbenzhydrylamine (pMBHA) resin (Peptides International, Louisville, Ky.), polystyrenes (e.g., PAM-resin obtained from Bachem Inc., Peninsula Laboratories, etc.), including chloromethylpolystyrene, hydroxymethylpolystyrene and aminomethylpolystyrene, poly(dimethylacrylamide)-grafted styrene co-divinyl-benzene (e.g., POLYHIPE resin, obtained from Aminotech, Canada), polyamide resin (obtained from Peninsula Laboratories), polystyrene resin grafted with polyethylene glycol (e.g., TENTAGEL or ARGOGEL, Bayer, Tubingen, Germany), polydimethylacrylamide resin (obtained from Milligen/Biosearch, California), or Sepharose (Pharmacia, Sweden). In another embodiment, the solid support can be a magnetic bead coated with streptavidin, such as Dynabeads Streptavidin (Dynal Biotech, Oslo, Norway).

In one embodiment, the solid phase support is suitable for in vivo use, i.e., it can serve as a carrier or support for administration of the compound to a patient (e.g., TENTAGEL, Bayer, Tubingen, Germany). In a particular embodiment, the solid support is palatable and/or orally ingestible.

Any technique known to one of skill in the art can be used to attach compounds to a solid support for use in the assays of the invention. In some embodiments of the present invention, compounds can be attached to solid supports via linkers. Linkers can be integral and part of the solid support, or they may be nonintegral linkers that are either synthesized on the solid support or attached thereto after synthesis. Linkers are useful not only for providing points of compound attachment to the solid support, but also for allowing different groups of molecules to be cleaved from the solid support under different conditions, depending on the nature of the linker. For example, linkers can be, inter alia, electrophilically cleaved, nucleophilically cleaved, photocleavable, enzymatically cleaved, cleaved by metals, cleaved under reductive conditions or cleaved under oxidative conditions.

In some embodiments of the present invention, each compound contains a common functional group that mediates covalent attachment to a solid support. In a specific embodiment of the invention, the functional group that mediates covalent attachment to a solid support varies between the compounds. Compounds can be attached on a solid support in any orientation and distribution that is suitable for the assays of the invention. In a further embodiment, compounds are attached or spotted on a solid support such as, e.g., a glass slide, with high spatial density and uniform distance between each spot so that an array is formed. Each surface is subsequently probed with a compound of interest.

In one embodiment, compounds are applied directly to a surface, such as, e.g., a glass slide, using a manual transfer technique. In a particular embodiment, the compounds are transferred or spotted on a surface from a microtiter plate using a robotic arrayer. In another embodiment, the compounds are attached to beads that are subsequently transferred to wells in a microtiter plate where the compounds are released before being arrayed on a surface using any of the means described above. Any type and size of bead can be used to attach compounds of the invention. One skilled in the art would be familiar with the bead properties necessary for a specific purpose. In a particular embodiment, the bead material is polystyrene.

In another embodiment, the combinatorial compound libraries can be assembled in situ using dynamic combinatorial chemistry as described in European Patent Application 1,118,359 A1 to Lehn; Huc & Nguyen, 2001, Comb. Chem. High Throughput Screen. 4:53-74; Lehn and Eliseev, 2001, Science 291:2331-2332; Cousins et al., 2000, Curr. Opin. Chem. Biol. 4:270-279; and Karan & Miller, 2000, Drug Disc. Today 5:67-75, which are incorporated by reference in their entirety.

Dynamic combinatorial chemistry uses non-covalent interaction with a target biomolecule, including but not limited to a protein, RNA, or DNA, to favor assembly of the most tightly binding molecule that is a combination of constituent subunits present as a mixture in the presence of the biomolecule. According to the laws of thermodynamics, when a collection of molecules is able to combine and recombine at equilibrium through reversible chemical reactions in solution, molecules, preferably one molecule, that bind most tightly to a templating biomolecule will be present in greater amount than all other possible combinations. The reversible chemical reactions include, but are not limited to, imine, acyl-hydrazone, amide, acetal, or ester formation between carbonyl-containing compounds and amines, hydrazines, or alcohols; thiol exchange between disulfides; alcohol exchange in borate esters; Diels-Alder reactions; thermal- or photoinduced sigmatropic or electrocyclic rearrangements; or Michael reactions.

In the preferred embodiment of this technique, the constituent components of the dynamic combinatorial compound library are allowed to combine and reach equilibrium in the absence of the target RNA and then incubated in the presence of the target RNA, preferably at physiological conditions, until a second equilibrium is reached. The second, perturbed, equilibrium (the so-called “templated mixture”) can, but need not necessarily, be fixed by a further chemical transformation, including but not limited to reduction, oxidation, hydrolysis, acidification, or basification, to prevent restoration of the original equilibrium when the dynamic combinatorial compound library is separated from the target RNA.

In the preferred embodiment of this technique, the predominant product or products of the templated dynamic combinatorial library can be separated from the minor products and directly identified. In another embodiment, the predominant product or products can be identified by a deconvolution strategy involving preparation of derivative dynamic combinatorial libraries, as described in European Patent Application 1,118,359 A1, which is incorporated by reference in its entirety, whereby each component of the mixture is, preferably one-by-one but possibly group-wise, left out of the mixture and the ability of the derivative library mixture at chemical equilibrium to bind the target RNA is measured. The components whose removal most greatly reduces the ability of the derivative dynamic combinatorial library to bind the target RNA are likely the components of the predominant product or products in the original dynamic combinatorial library.
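The one-by-one deconvolution strategy described above amounts to a leave-one-out comparison. The following Python sketch is illustrative only: the function `rank_components_by_omission`, the component labels, and the toy binding function are hypothetical stand-ins for an experimental binding measurement, and the sketch assumes a single numerical binding signal per derivative library.

```python
def rank_components_by_omission(components, measure_binding):
    """Rank library components by how much binding drops when each is left out.

    measure_binding is a caller-supplied function (here a stand-in for an
    experimental measurement) that takes a frozenset of components and
    returns the binding signal of that mixture at equilibrium.
    """
    full_library = frozenset(components)
    full_signal = measure_binding(full_library)
    drops = {}
    for component in components:
        derivative_library = full_library - {component}  # leave one component out
        drops[component] = full_signal - measure_binding(derivative_library)
    # the largest drop comes first: likely a part of the predominant product
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


def toy_binding(mixture):
    """Hypothetical signal: strong binding only when A2 and B3 can combine."""
    background = 1.0
    return background + (10.0 if {"A2", "B3"} <= mixture else 0.0)


if __name__ == "__main__":
    components = ["A1", "A2", "A3", "B1", "B2", "B3"]
    for name, drop in rank_components_by_omission(components, toy_binding):
        print(name, drop)
```

In this toy case, omitting either A2 or B3 removes the strong signal, so those two components rank first, mirroring the inference drawn in the text.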

5.5 Library Screening

After a target nucleic acid, such as but not limited to RNA or DNA, is labeled and a compound library is synthesized or purchased or both, the labeled target nucleic acid is used to screen the library to identify compounds that bind to the nucleic acid. Screening comprises contacting a labeled target nucleic acid with an individual, or small group, of the compounds of the compound library. Preferably, the contacting occurs in an aqueous solution, and most preferably, under physiologic conditions. The aqueous solution preferably stabilizes the labeled target nucleic acid and prevents denaturation or degradation of the nucleic acid without interfering with binding of the compounds. The aqueous solution can be similar to the solution in which a complex between the target RNA and its corresponding host cell factor (if known) is formed in vitro. For example, TK buffer, which is commonly used to form Tat protein-TAR RNA complexes in vitro, can be used in the methods of the invention as an aqueous solution to screen a library of compounds for RNA binding compounds.

Alternatively, compounds are labeled and target RNA molecules are used to screen the library of labeled compounds. After compounds are labeled, target nucleic acids are used to screen the library of labeled compounds to identify those nucleic acids that bind to the labeled compounds. Screening comprises contacting a target nucleic acid with an individual, or small group, of the compounds of the labeled compound library. Preferably, the contacting occurs in an aqueous solution, and most preferably under physiologic conditions. The aqueous solution preferably stabilizes the target nucleic acid and prevents denaturation or degradation of the nucleic acid without interfering with binding of the compounds. The aqueous solution can be similar to the solution in which a complex between the target RNA and its corresponding host cell factor (if known) is formed in vitro. For example, TK buffer, which is commonly used to form Tat protein-TAR RNA complexes in vitro, can be used in the methods of the invention as an aqueous solution to screen a library of compounds for RNA binding compounds.

The methods of the present invention for screening a library of compounds preferably comprise contacting a compound with a target nucleic acid in the presence of an aqueous solution, the aqueous solution comprising a buffer and a combination of salts, preferably approximating or mimicking physiologic conditions. The aqueous solution optionally further comprises non-specific nucleic acids, such as, but not limited to, DNA; yeast tRNA; salmon sperm DNA; homoribopolymers such as, but not limited to, poly IC, polyA, polyU, and polyC; and non-specific RNA. The non-specific RNA may be an unlabeled target nucleic acid having a mutation at the binding site, which renders the unlabeled nucleic acid incapable of interacting with a compound at that site. For example, if dye-labeled TAR RNA is used to screen a library, unlabeled TAR RNA having a mutation in the uracil 23/cytosine 24 bulge region may also be present in the aqueous solution. Without being bound by any theory, the addition of unlabeled RNA that is essentially identical to the dye-labeled target RNA except for a mutation at the binding site might minimize interactions of other regions of the dye-labeled target RNA with compounds or with the solid support and prevent false positive results.

The solution further comprises a buffer, a combination of salts, and optionally, a detergent or a surfactant. The pH of the aqueous solution typically ranges from about 5 to about 8, preferably from about 6 to about 8, most preferably from about 6.5 to about 8. A variety of buffers may be used to achieve the desired pH. Suitable buffers include, but are not limited to, Tris, Mes, Bis-Tris, Ada, Aces, Pipes, Mopso, Bis-Tris propane, Bes, Mops, Tes, Hepes, Dipso, Mobs, Tapso, Trizma, Heppso, Popso, TEA, Epps, Tricine, Gly-Gly, Bicine, and sodium-potassium phosphate. The buffering agent is present at from about 10 mM to about 100 mM, preferably from about 25 mM to about 75 mM, most preferably from about 40 mM to about 60 mM. The pH of the aqueous solution can be optimized for different screening reactions, depending on the target RNA used and the types of compounds in the library, and therefore, the type and amount of the buffer used in the solution can vary from screen to screen. In a preferred embodiment, the aqueous solution has a pH of about 7.4, which can be achieved using about 50 mM Tris buffer.

In addition to an appropriate buffer, the aqueous solution further comprises a combination of salts, from about 0 mM to about 100 mM KCl, from about 0 mM to about 1 M NaCl, and from about 0 mM to about 200 mM MgCl₂. In a preferred embodiment, the combination of salts is about 100 mM KCl, 500 mM NaCl, and 10 mM MgCl₂. Without being bound by any theory, Applicant has found that a combination of KCl, NaCl, and MgCl₂ stabilizes the target RNA such that most of the RNA is not denatured or digested over the course of the screening reaction. The optimal concentration of each salt used in the aqueous solution is dependent on the particular target RNA used and can be determined using routine experimentation.

The solution optionally comprises from about 0.01% to about 0.5% (w/v) of a detergent or a surfactant. Without being bound by any theory, a small amount of detergent or surfactant in the solution might reduce non-specific binding of the target RNA to the solid support and control aggregation and increase stability of target RNA molecules. Typical detergents useful in the methods of the present invention include, but are not limited to, anionic detergents, such as salts of deoxycholic acid, 1-heptanesulfonic acid, N-laurylsarcosine, lauryl sulfate, 1-octane sulfonic acid and taurocholic acid; cationic detergents such as benzalkonium chloride, cetylpyridinium, methylbenzethonium chloride, and decamethonium bromide; zwitterionic detergents such as CHAPS, CHAPSO, alkyl betaines, alkyl amidoalkyl betaines, N-dodecyl-N,N-dimethyl-3-ammonio-1-propanesulfonate, and phosphatidylcholine; and non-ionic detergents such as n-decyl α-D-glucopyranoside, n-decyl β-D-maltopyranoside, n-dodecyl β-D-maltoside, n-octyl β-D-glucopyranoside, sorbitan esters, n-tetradecyl β-D-maltoside, octylphenoxy polyethoxyethanol (Nonidet P-40), nonylphenoxypolyethoxyethanol (NP-40), and tritons. Preferably, the detergent, if present, is a nonionic detergent. Typical surfactants useful in the methods of the present invention include, but are not limited to, ammonium lauryl sulfate, polyethylene glycols, butyl glucoside, decyl glucoside, Polysorbate 80, lauric acid, myristic acid, palmitic acid, potassium palmitate, undecanoic acid, lauryl betaine, and lauryl alcohol. More preferably, the detergent, if present, is Triton X-100 and is present in an amount of about 0.1% (w/v).
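The composition ranges recited above for the aqueous screening solution can be collected into a simple check. The following Python sketch is illustrative only: the dictionary keys and the function name `out_of_range` are hypothetical, and the numerical limits are the broadest ranges recited above, treated as hard limits even though the text states them approximately ("about").

```python
# Broadest ranges recited in the text; key names are hypothetical.
RANGES = {
    "pH": (5.0, 8.0),
    "buffer_mM": (10.0, 100.0),
    "KCl_mM": (0.0, 100.0),
    "NaCl_mM": (0.0, 1000.0),
    "MgCl2_mM": (0.0, 200.0),
    "detergent_pct_wv": (0.01, 0.5),
}
OPTIONAL = {"detergent_pct_wv"}  # detergent or surfactant may be omitted


def out_of_range(recipe):
    """Return the names of components that are missing or outside RANGES."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = recipe.get(name)
        if value is None:
            if name not in OPTIONAL:
                problems.append(name)  # a required component is missing
            continue
        if not (low <= value <= high):
            problems.append(name)
    return problems


# Preferred embodiment as recited: pH 7.4, 50 mM Tris, 100 mM KCl,
# 500 mM NaCl, 10 mM MgCl2, 0.1% (w/v) Triton X-100.
PREFERRED = {
    "pH": 7.4,
    "buffer_mM": 50.0,
    "KCl_mM": 100.0,
    "NaCl_mM": 500.0,
    "MgCl2_mM": 10.0,
    "detergent_pct_wv": 0.1,
}

if __name__ == "__main__":
    print(out_of_range(PREFERRED))
```

A check of this kind may be convenient when the solution is re-optimized for a different target RNA, since the optimal values are stated to vary from screen to screen.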

Non-specific binding of a labeled target nucleic acid to compounds can be further minimized by treating the binding reaction with one or more blocking agents. Non-specific binding of an unlabeled target nucleic acid to labeled compounds can likewise be further minimized by treating the binding reaction with one or more blocking agents. In one embodiment, the binding reactions are treated with a blocking agent, e.g., bovine serum albumin (“BSA”), before contacting with the labeled target nucleic acid. In another embodiment, the binding reactions are treated sequentially with at least two different blocking agents. This blocking step is preferably performed at room temperature for from about 0.5 to about 3 hours. In a subsequent step, the reaction mixture is further treated with unlabeled RNA having a mutation at the binding site. This blocking step is preferably performed at about 4° C. for from about 12 hours to about 36 hours before addition of the dye-labeled target RNA. Preferably, the solution used in the one or more blocking steps is substantially similar to the aqueous solution used to screen the library with the dye-labeled target RNA, e.g., in pH and salt concentration.

Once contacted, the mixture of labeled target nucleic acid and the compound is preferably maintained at 4° C. for from about 1 day to about 5 days, preferably from about 2 days to about 3 days, with constant agitation. To identify the reactions in which binding to the labeled target nucleic acid occurred, after the incubation period, bound compounds are distinguished from free compounds using any of the methods disclosed in Section 5.6 infra. In a specific embodiment, the complexed target nucleic acid does not need to be separated from the free target nucleic acid if a technique (e.g., spectrometry) that differentiates between bound and unbound target nucleic acids is used.

In another embodiment, once contacted, the mixture of target nucleic acid and the labeled compound is preferably maintained at 4° C. for from about 1 day to about 5 days, preferably from about 2 days to about 3 days, with constant agitation. To identify the reactions in which binding to the target nucleic acid occurred, after the incubation period, bound compounds are distinguished from free compounds using any of the methods disclosed in Section 5.6 infra. In a specific embodiment, the complexed target nucleic acid does not need to be separated from the free target nucleic acid if a technique (e.g., spectrometry) that differentiates between bound and unbound target nucleic acids is used.

The methods for identifying small molecules bound to labeled nucleic acid will vary with the type of label on the target nucleic acid. For example, if a target RNA is labeled with a visible or fluorescent dye, the target RNA complexes are preferably identified using a chromatographic technique that separates bound from free target by an electrophoretic or size differential technique using individual reactions. The reactions corresponding to changes in the migration of the complexed RNA can be cross-referenced to the small molecule compound(s) added to said reaction. Alternatively, complexed target RNA can be screened en masse and then separated from free target RNA using an electrophoretic or size differential technique; the resultant complexed target is then analyzed using a mass spectrometric technique. In this fashion the bound small molecule can be identified on the basis of its molecular weight. This approach requires a priori knowledge of the exact molecular weights of all compounds within the library. In another embodiment, the compounds bound to the target nucleic acid may not require separation from the unbound target nucleic acid if a technique such as, but not limited to, spectrometry is used.

The methods for identifying labeled small molecules bound to unlabeled nucleic acid will vary with the type of label on the compound. For example, if a compound is labeled with a visible or fluorescent dye, the target RNA complexes are preferably identified using a chromatographic technique that separates bound from free target by an electrophoretic or size differential technique using individual reactions. The reactions corresponding to changes in the migration of the complexed RNA can be cross-referenced to the small molecule compound(s) added to said reaction. Alternatively, complexed target RNA can be screened en masse and then separated from free target RNA using an electrophoretic or size differential technique; the resultant complexed target is then analyzed using a mass spectrometric technique. In this fashion the bound small molecule can be identified on the basis of its molecular weight. This approach requires a priori knowledge of the exact molecular weights of all compounds within the library. In another embodiment, the compounds bound to the target nucleic acid may not require separation from the unbound target nucleic acid if a technique such as, but not limited to, spectrometry is used.
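Identification of a bound small molecule by molecular weight, as described above for en masse screening, can be sketched as a lookup against the known masses of the library. The following Python sketch is illustrative only: the function `identify_ligands`, the compound identifiers, all masses, and the tolerance are hypothetical, and the sketch assumes a 1:1 non-covalent complex whose mass is the sum of the target RNA mass and the ligand mass.

```python
def identify_ligands(complex_mass, rna_mass, library_masses, tolerance=0.5):
    """Return library compounds whose known mass matches the bound ligand.

    Assumes a 1:1 complex, so the ligand mass is the measured mass of the
    complex minus the mass of the free target RNA. library_masses maps a
    compound identifier to its known molecular weight (same units).
    """
    ligand_mass = complex_mass - rna_mass
    return sorted(
        name
        for name, mass in library_masses.items()
        if abs(mass - ligand_mass) <= tolerance
    )


if __name__ == "__main__":
    # Hypothetical library and measurements, for illustration only.
    library = {"cmpd_001": 310.2, "cmpd_002": 455.6, "cmpd_003": 456.9}
    print(identify_ligands(9955.7, 9500.0, library))
```

If two library members have masses within the tolerance of each other, the function returns both, which reflects why exact molecular weights for every compound in the library must be known in advance.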

5.6 Separation Methods for Screening Compounds

Any method that detects an altered physical property of a target nucleic acid complexed to a compound, relative to the unbound target nucleic acid, may be used for separation of the complexed and non-complexed target nucleic acids. Methods that can be utilized for the physical separation of complexed target RNA from unbound target RNA include, but are not limited to, electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation proximity assay, structure-activity relationships (“SAR”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, nanoparticle aggregation, flow cytometry, manual batch, and suspension of beads in electric fields.

In embodiments that use solid support based methods, after the labeled target RNA is contacted with the library of compounds immobilized on a solid support (e.g., beads) or the target RNA conjugated or attached to the solid support is contacted with the library of detectably labeled compounds, the solid support (e.g., beads) must then be separated from the unbound target RNA or unbound compounds, respectively, in the liquid phase. This can be accomplished by any number of physical means; e.g., sedimentation, centrifugation. Thereafter, a number of methods can be used to separate the solid support-based library that is complexed with the labeled target RNA from uncomplexed beads in order to isolate the compound on the bead. Alternatively, mass spectroscopy and NMR spectroscopy can be used to simultaneously identify and separate beads complexed to the labeled target RNA from uncomplexed beads.

5.6.1 Electrophoresis

Methods for separation of the complex of a target RNA bound to a compound from the unbound RNA comprise any method of electrophoretic separation, including but not limited to, denaturing and non-denaturing polyacrylamide gel electrophoresis, urea gel electrophoresis, gel filtration, pulsed field gel electrophoresis, two dimensional gel electrophoresis, continuous flow electrophoresis, zone electrophoresis, agarose gel electrophoresis, and capillary electrophoresis.

In a preferred embodiment, an automated electrophoretic system comprising a capillary cartridge having a plurality of capillary tubes is used for high-throughput screening of compounds bound to target RNA. Such an apparatus for performing automated capillary gel electrophoresis is disclosed in U.S. Pat. Nos. 5,885,430; 5,916,428; 6,027,627; and 6,063,251, the disclosures of which are incorporated by reference in their entireties.

The device disclosed in U.S. Pat. No. 5,885,430, which is incorporated by reference in its entirety, allows one to simultaneously introduce samples into a plurality of capillary tubes directly from microtiter trays having a standard size. U.S. Pat. No. 5,885,430 discloses a disposable capillary cartridge which can be cleaned between electrophoresis runs, the cartridge having a plurality of capillary tubes. A first end of each capillary tube is retained in a mounting plate, the first ends collectively forming an array in the mounting plate. The spacing between the first ends corresponds to the spacing between the centers of the wells of a microtiter tray having a standard size. Thus, the first ends of the capillary tubes can simultaneously be dipped into the samples present in the tray's wells. The cartridge is provided with a second mounting plate in which the second ends of the capillary tubes are retained. The second ends of the capillary tubes are arranged in an array which corresponds to the wells in the microtiter tray, which allows for each capillary tube to be isolated from its neighbors and therefore free from cross-contamination, as each end is dipped into an individual well.

Plate holes may be provided in each mounting plate and the capillary tubes inserted through these plate holes. In such a case, the plate holes are sealed airtight so that the side of the mounting plate having the exposed capillary ends can be pressurized. Application of a positive pressure in the vicinity of the capillary openings in this mounting plate allows for the introduction of air and fluids during electrophoretic operations and also can be used to force out gel and other materials from the capillary tubes during reconditioning. The capillary tubes may be protected from damage using a needle comprising a cannula and/or plastic tubes, and the like when they are placed in these plate holes. When metallic cannula or the like are used, they can serve as electrical contacts for current flow during electrophoresis. In the presence of a second mounting plate, the second mounting plate is provided with plate holes through which the second ends of the capillary tubes project. In this instance, the second mounting plate serves as a pressure containment member of a pressure cell and the second ends of the capillary tubes communicate with an internal cavity of the pressure cell. The pressure cell is also formed with an inlet and an outlet. Gels, buffer solutions, cleaning agents, and the like may be introduced into the internal cavity through the inlet, and each of these can simultaneously enter the second ends of the capillaries.

In another preferred embodiment, the automated electrophoretic system can comprise a chip system consisting of complex designs of interconnected channels that perform and analyze enzyme reactions using part of a channel design as a tiny, continuously operating electrophoresis material, where reactions with one sample are going on in one area of the chip while electrophoretic separation of the products of another sample is taking place in a different part of the chip. Such a system is disclosed in U.S. Pat. Nos. 5,699,157; 5,842,787; 5,869,004; 5,876,675; 5,942,443; 5,948,227; 6,042,709; 6,042,710; 6,046,056; 6,048,498; 6,086,740; 6,132,685; 6,150,119; 6,150,180; 6,153,073; 6,167,910; 6,171,850; and 6,186,660, the disclosures of which are incorporated by reference in their entireties.

The system disclosed in U.S. Pat. No. 5,699,157, which is hereby incorporated by reference in its entirety, provides for a microfluidic system for high-speed electrophoretic analysis of subject materials for applications in the fields of chemistry, biochemistry, biotechnology, molecular biology and numerous other areas. The system has a channel in a substrate, a light source and a photoreceptor. The channel holds subject materials in solution in an electric field so that the materials move through the channel and separate into bands according to species. The light source excites fluorescent light in the species bands and the photoreceptor is arranged to receive the fluorescent light from the bands. The system further has a means for masking the channel so that the photoreceptor can receive the fluorescent light only at periodically spaced regions along the channel. The system also has a unit connected to analyze the modulation frequencies of light intensity received by the photoreceptor so that velocities of the bands along the channel are determined, which allows the materials to be analyzed.

The system disclosed in U.S. Pat. No. 5,699,157 also provides for a method of performing high-speed electrophoretic analysis of subject materials, which comprises the steps of holding the subject materials in solution in a channel of a microfluidic system; subjecting the materials to an electric field so that the subject materials move through the channel and separate into species bands; directing light toward the channel; receiving light from periodically spaced regions along the channel simultaneously; and analyzing the frequencies of light intensity of the received light so that velocities of the bands along the channel can be determined for analysis of said materials. The determination of the velocity of a species band determines the electrophoretic mobility of the species and its identification.
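By way of illustration only, the relationship between the modulation frequency of the received light and the velocity of a band can be sketched as follows. The sketch assumes that a band crossing detection regions spaced a distance d apart at velocity v modulates the light at frequency f = v/d, and that mobility is velocity divided by field strength. The function names and numerical values are hypothetical and are not taken from U.S. Pat. No. 5,699,157.

```python
def band_velocity(mod_freq_hz, region_spacing_m):
    """Velocity (m/s) of a band that crosses periodically spaced
    detection regions, assuming f = v / d and therefore v = f * d."""
    return mod_freq_hz * region_spacing_m


def electrophoretic_mobility(velocity_m_s, field_v_per_m):
    """Electrophoretic mobility (m^2 V^-1 s^-1) as velocity per unit field."""
    return velocity_m_s / field_v_per_m


# Hypothetical values: 5 Hz modulation, 100 micrometer spacing, 250 V/cm field.
v = band_velocity(5.0, 100e-6)
mu = electrophoretic_mobility(v, 25000.0)
print(v, mu)
```

Two species that migrate at different velocities produce different modulation frequencies, so a frequency analysis of the photoreceptor signal can in principle distinguish a complexed target from a free target without imaging the individual bands.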

U.S. Pat. No. 5,842,787, which is hereby incorporated by reference in its entirety, is generally directed to devices and systems that employ channels having, at least in part, depths that are varied over those which have been previously described (such as the device disclosed in U.S. Pat. No. 5,699,157), wherein said channel depths provide numerous beneficial and unexpected results such as, but not limited to, a reduction in sample perturbation, reduced non-specific sample mixture by diffusion, and increased resolution.

In another embodiment, the electrophoretic method of separation comprises polyacrylamide gel electrophoresis. In a preferred embodiment, the polyacrylamide gel electrophoresis is non-denaturing, so as to differentiate the mobilities of the target RNA bound to a compound from free target RNA. If the polyacrylamide gel electrophoresis is denaturing, then the target RNA:compound complex must be cross-linked prior to electrophoresis to prevent the dissociation of the target RNA from the compound during electrophoresis. Such techniques are well known to one of skill in the art.

In one embodiment of the method, the binding of compounds to target nucleic acid can be detected, preferably in an automated fashion, by gel electrophoretic analysis of interference footprinting. RNA can be degraded at specific base sites by enzymatic methods such as ribonucleases A, U₂, CL₃, T₁, Phy M, and B. cereus or chemical methods such as diethylpyrocarbonate, sodium hydroxide, hydrazine, piperidine formate, dimethyl sulfate, [2,12-dimethyl-3,7,11,17-tetraazacyclo[11.3.1]heptadeca-1(17),2,11,13,15-pentaenato]nickel(II) (NiCR), cobalt(II) chloride, or iron(II) ethylenediaminetetraacetate (Fe-EDTA) as described for example in Zheng et al., 1999, Biochem. 37:2207-2214; Latham & Cech, 1989, Science 245:276-282; and Sambrook et al., 2001, in Molecular Cloning: A Laboratory Manual, pp 12.61-12.73, Cold Spring Harbor Laboratory Press, and the references cited therein, which are hereby incorporated by reference in their entireties.

The specific pattern of cleavage sites is determined by the accessibility of particular bases to the reagent employed to initiate cleavage and, as such, is therefore determined by the three-dimensional structure of the RNA. The interaction of small molecules with a target nucleic acid can change the accessibility of bases to these cleavage reagents either by causing conformational changes in the target nucleic acid or by covering a base at the binding interface. When a compound binds to the nucleic acid and changes the accessibility of bases to cleavage reagents, the observed cleavage pattern will change. This method can be used to identify and characterize the binding of small molecules to RNA as described, for example, by Prudent et al., 1995, J. Am. Chem. Soc. 117:10145-10146 and Mei et al., 1998, Biochem. 37:14204-14212.

In the preferred embodiment of this technique, the detectably labeled target nucleic acid is incubated with an individual compound and then subjected to treatment with a cleavage reagent, either enzymatic or chemical. The reaction mixture can preferably be examined directly, or treated further to isolate and concentrate the nucleic acid. The fragments produced are separated by electrophoresis and the pattern of cleavage can be compared to a cleavage reaction performed in the absence of compound. A change in the cleavage pattern directly indicates that the compound binds to the target nucleic acid. Multiple compounds can be examined both in parallel and serially.
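By way of illustration only, the comparison of a cleavage pattern obtained in the presence of a compound with the pattern obtained in its absence can be sketched as a position-by-position comparison of band intensities. The function name, the two-fold cutoff, and the intensity values below are hypothetical and are not taken from this disclosure.

```python
def footprint_changes(control, treated, fold=2.0):
    """Compare band intensities keyed by nucleotide position.

    A position is reported as 'protected' when cleavage drops by at
    least `fold` in the presence of compound, and as 'enhanced' when it
    rises by at least `fold`. Unchanged positions are omitted."""
    changes = {}
    for pos in sorted(set(control) | set(treated)):
        c = control.get(pos, 0.0)
        t = treated.get(pos, 0.0)
        if c > 0 and t <= c / fold:
            changes[pos] = "protected"
        elif t > 0 and t >= c * fold:
            changes[pos] = "enhanced"
    return changes


# Hypothetical band intensities at three nucleotide positions.
control = {10: 100.0, 11: 80.0, 12: 50.0}
treated = {10: 95.0, 11: 20.0, 12: 130.0}
print(footprint_changes(control, treated))
```

An empty result corresponds to an unchanged cleavage pattern, whereas any reported position suggests either a base covered at the binding interface or a conformational change in the target nucleic acid.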

Other embodiments of electrophoretic separation include, but are not limited to, urea gel electrophoresis, gel filtration, pulsed field gel electrophoresis, two dimensional gel electrophoresis, continuous flow electrophoresis, zone electrophoresis, and agarose gel electrophoresis.

5.6.2 Fluorescence Spectroscopy

In a preferred embodiment, fluorescence polarization spectroscopy, an optical detection method that can differentiate the proportion of a fluorescent molecule that is either bound or unbound in solution (e.g., the labeled target nucleic acid of the present invention), can be used to read reaction results without electrophoretic separation of the samples. Fluorescence polarization spectroscopy can be used to read the reaction results in the chip system disclosed in U.S. Pat. Nos. 5,699,157; 5,842,787; 5,869,004; 5,876,675; 5,942,443; 5,948,227; 6,042,709; 6,042,710; 6,046,056; 6,048,498; 6,086,740; 6,132,685; 6,150,119; 6,150,180; 6,153,073; 6,167,910; 6,171,850; and 6,186,660, the disclosures of which are incorporated by reference in their entireties. The application of fluorescence polarization spectroscopy to the chip system disclosed in the U.S. Patents listed supra is fast, efficient, and well-adapted for high-throughput screening.
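By way of illustration only, the proportion of a fluorescent molecule that is bound can be estimated from a polarization reading as sketched below. The sketch assumes that anisotropy is additive over the bound and free populations and that binding does not change the fluorescence intensity of the label; these assumptions, the function names, and the numerical values are not taken from this disclosure.

```python
def anisotropy_from_polarization(p):
    """Convert polarization P to anisotropy r using r = 2P / (3 - P)."""
    return 2.0 * p / (3.0 - p)


def fraction_bound(r_obs, r_free, r_bound):
    """Fraction of labeled molecules that are bound, assuming
    r_obs = f * r_bound + (1 - f) * r_free. The result is clamped to
    the interval [0, 1] to absorb measurement noise."""
    if r_bound == r_free:
        raise ValueError("bound and free anisotropies must differ")
    f = (r_obs - r_free) / (r_bound - r_free)
    return min(1.0, max(0.0, f))


# Hypothetical readings: free label r = 0.05, fully bound label r = 0.25.
print(fraction_bound(0.15, 0.05, 0.25))
```

Because the reading is taken on the mixture itself, no separation of bound from free target is needed, which is consistent with the use of this method without electrophoretic separation of the samples.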

In another embodiment, a compound that has an affinity for the target nucleic acid of interest can be labeled with a fluorophore to screen for compounds that bind to the target nucleic acid. For example, a pyrene-containing aminoglycoside analog was used to accurately monitor antagonist binding to a prokaryotic 16S rRNA A site (which comprises the natural target for aminoglycoside antibiotics) in a screen using a fluorescence quenching technique in a 96-well plate format (Hamasaki & Rando, 1998, Anal. Biochem. 261(2):183-90).

In another embodiment, fluorescence resonance energy transfer (FRET) can be used to screen for compounds that bind to the target nucleic acid. FRET, a characteristic change in fluorescence, occurs when two fluorophores with overlapping emission and excitation wavelength bands are held together in close proximity, such as by a binding event. In the preferred embodiment, the fluorophore on the target nucleic acid and the fluorophore on the compounds will have overlapping excitation and emission spectra such that one fluorophore (the donor) transfers its emission energy to excite the other fluorophore (the acceptor). The acceptor preferably emits light of a different wavelength upon relaxing to the ground state, or relaxes non-radiatively to quench fluorescence. FRET is very sensitive to the distance between the two fluorophores, and allows measurement of molecular distances less than 10 nm. For example, U.S. Pat. No. 6,337,183 to Arenas et al., which is incorporated by reference in its entirety, describes a screen for compounds that bind RNA that uses FRET to measure the effect of compounds on the stability of a target RNA molecule where the target RNA is labeled with both fluorescent acceptor and donor molecules and the distance between the two fluorophores as determined by FRET provides a measure of the folded structure of the RNA. Matsumoto et al. (2000, Bioorg. Med. Chem. Lett. 10:1857-1861) describe a system where a peptide that binds to HIV-1 TAR RNA is labeled on one end with a fluorescein fluorophore and a tetramethylrhodamine on the other end. The conformational change of the peptide upon binding to the RNA provided a FRET signal to screen for compounds that bound to the TAR RNA.
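By way of illustration only, the sensitivity of FRET to the distance between the two fluorophores can be sketched with the standard Förster relation, in which the transfer efficiency is E = 1 / (1 + (r/R0)^6) and R0 is the distance at which the efficiency is one half. The function names and the R0 value of 5 nm are hypothetical and are not taken from this disclosure.

```python
def fret_efficiency(r_nm, r0_nm):
    """Förster transfer efficiency for donor-acceptor distance r_nm,
    where r0_nm is the distance giving 50% efficiency."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def distance_from_efficiency(e, r0_nm):
    """Invert the Förster relation to recover the distance (nm)."""
    if not 0.0 < e < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / e - 1.0) ** (1.0 / 6.0)


# Hypothetical Förster radius of 5 nm.
for r in (2.5, 5.0, 10.0):
    print(r, round(fret_efficiency(r, 5.0), 4))
```

The sixth-power dependence means that the efficiency falls from near unity to a few percent as the separation goes from half of R0 to twice R0, which is why a binding event that brings the donor and acceptor together yields a pronounced change in fluorescence.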

In the preferred embodiment, both the target nucleic acid and a compound that has an affinity for the target nucleic acid of interest are labeled with fluorophores with overlapping emission and excitation spectra (donor and acceptor), including but not limited to fluorescein and derivatives, rhodamine and derivatives, cyanine dyes and derivatives, bora-3a,4a-diaza-s-indacene (BODIPY®) and derivatives, pyrene, nanoparticles, or non-fluorescent quenching molecules. Binding of a labeled compound to the target nucleic acid can be identified by the change in observable fluorescence as a result of FRET.

If the target nucleic acid is labeled with the donor fluorophore, then the compounds are labeled with the acceptor fluorophore. Conversely, if the target nucleic acid is labeled with the acceptor fluorophore, then the compounds are labeled with the donor fluorophore. A wide variety of labels may be used, with the choice of label depending on sensitivity required, ease of conjugation with the compound, stability requirements, available instrumentation, and disposal provisions. The fluorophore on the target nucleic acid must be in close proximity to the binding site of the compounds, but should not be incorporated into a target nucleic acid at the specific binding site at which compounds are likely to bind, since the presence of a covalently attached label might interfere sterically or chemically with the binding of the compounds at this site.

In yet another embodiment, homogeneous time-resolved fluorescence (“HTRF”) techniques based on time-resolved energy transfer from lanthanide ion complexes to a suitable acceptor species can be adapted for high-throughput screening for inhibitors of RNA-protein complexes (Hemmilä, 1999, J. Biomol. Screening 4:303-307; Mathis, 1999, J. Biomol. Screening 4:309-313). HTRF is similar to fluorescence resonance energy transfer using conventional organic dye pairs, but has several advantages, such as increased sensitivity and efficiency, and background elimination (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356).

It is also contemplated that the target RNA may be labeled with a fluorophore and the compounds in the library labeled with a quencher of the fluorophore, or alternatively, the target RNA labeled with a quencher of the fluorophore and the compounds in the library labeled with a fluorophore, so that when a compound and target RNA bind, the fluorescent signal of the fluorophore is quenched.

Fluorescence spectroscopy has traditionally been used to characterize DNA-protein and protein-protein interactions, but fluorescence spectroscopy has not been widely used to characterize RNA-protein interactions because of an interfering absorption of RNA nucleotides with the intrinsic tryptophan fluorescence of proteins (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). However, fluorescence spectroscopy has been used in studying the single tryptophan residue within the arginine-rich RNA-binding domain of Rev protein and its interaction with the RRE in a time-resolved fluorescence study (Kwon & Carson, 1998, Anal. Biochem. 264:133-140). Thus, in this invention, fluorescence spectroscopy is less preferred if the compounds or peptides or proteins possess intrinsic tryptophan fluorescence. However, fluorescence spectroscopy can be used for compounds that do not possess intrinsic fluorescence.

5.6.3 Surface Plasmon Resonance (“SPR”)

Surface plasmon resonance (SPR) can be used for determining kinetic rate constants and equilibrium constants for macromolecular interactions by following the association process in “real time” (Schuck, 1997, Annu. Rev. Biophys. Biomol. Struct. 26:541-566).

The principle of SPR is summarized by Xavier et al. (Trends Biotechnol., 2000, 18(8):349-356) as follows. Total internal reflection occurs at the boundary between two substances of different refractive index. The incident light's electromagnetic field penetrates beyond the interface as an evanescent wave, which extends a few hundred nanometers beyond the surface into the medium. Insertion of a thin gold foil at the interface produces SPR owing to the absorption of the energy from the evanescent wave by free electron clouds of the metal (plasmons). As a result of this absorbance, there is a drop in the intensity of the reflected light at a particular angle of incidence. The evanescent wave profile depends exquisitely on the refractive index of the medium it probes. Thus, the angle at which absorption occurs is very sensitive to refractive index changes in the external medium. All proteins and nucleic acids are known to change the refractive index of water by a similar amount per unit mass, irrespective of their amino acid or nucleotide composition (the refractive index change is different for proteins and nucleic acids). When the protein or nucleic acid content of the layer at the sensor changes, the refractive index also changes. Typically, one member of a complex is immobilized in a dextran layer and then the other member is introduced into the solution, either in a flow cell (Biacore AB, Uppsala, Sweden) or a stirred cuvette (Affinity Sensors, Santa Fe, N. Mex.). It has been determined that there is a linear correlation between the surface concentration of protein or nucleic acid and the shift in resonance angle, which can be used to quantitate kinetic rate constants and/or the equilibrium constants.
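By way of illustration only, the way in which a response proportional to surface concentration yields rate and equilibrium constants can be sketched with a simple one-to-one binding model. The choice of this model, the function names, and the numerical values are assumptions and are not taken from this disclosure or from the cited references.

```python
import math


def equilibrium_response(conc_m, k_d_m, r_max):
    """Plateau response for analyte concentration conc_m (M), given the
    equilibrium dissociation constant k_d_m (M) and the response at
    saturation r_max, for a 1:1 interaction."""
    return r_max * conc_m / (conc_m + k_d_m)


def association_response(t_s, conc_m, k_on, k_off, r_max):
    """Response at time t_s (s) during the association phase of a 1:1
    interaction: R(t) = R_eq * (1 - exp(-(k_on * C + k_off) * t))."""
    k_obs = k_on * conc_m + k_off
    r_eq = equilibrium_response(conc_m, k_off / k_on, r_max)
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


# Hypothetical constants: k_on = 1e5 per M per s, k_off = 1e-2 per s,
# so that K_D = k_off / k_on = 1e-7 M.
for t in (0.0, 30.0, 300.0):
    print(t, round(association_response(t, 1e-7, 1e5, 1e-2, 100.0), 2))
```

Under this model, fitting the measured response against time at several analyte concentrations gives the two rate constants, and their ratio gives the equilibrium dissociation constant.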

In the present invention, the target RNA may be immobilized to the sensor surface through a streptavidin-biotin linkage, the linkage of which is disclosed by Crouch et al. (Methods Mol. Biol., 1999, 118:143-160). The RNA is biotinylated either during synthesis or post-synthetically via the conversion of the 3′ terminal ribonucleoside of the RNA into a reactive free amino group or using a T7 polymerase incorporated guanosine monophosphorothioate at the 5′ end. SPR has been used to determine the stoichiometry and affinity of the interaction between the HIV Rev protein and the RRE (Van Ryk & Venkatesan, 1999, J. Biol. Chem. 274:17452-17463) and the aminoglycoside antibiotics with RRE and a model RNA derived from the 16S ribosomal A site, respectively (Hendrix et al., 1997, J. Am. Chem. Soc. 119:3641-3648; Wong et al., 1998, Chem. Biol. 5:397-406).

In one embodiment of the present invention, the target nucleic acid canbe immobilized to a sensor surface (e.g., by a streptavidin-biotinlinkage) and SPR can be used to (a) determine whether the target RNAbinds a compound and (b) further characterize the binding of the targetnucleic acids of the present invention to a compound.

5.6.4 Mass Spectrometry

An automated method for analyzing mass spectrometer data which can analyze complex mixtures containing many thousands of components and can correct for background noise, multiply charged peaks and atomic isotope peaks is described in U.S. Pat. No. 6,147,344, which is hereby incorporated by reference in its entirety. The system disclosed in U.S. Pat. No. 6,147,344 is a method for analyzing mass spectrometer data in which a control sample measurement is performed providing a background noise check. The peak height and width values at each m/z ratio as a function of time are stored in a memory. A mass spectrometer operation on a material to be analyzed is performed and the peak height and width values at each m/z ratio versus time are stored in a second memory location. The mass spectrometer operation on the material to be analyzed is repeated a fixed number of times and the stored control sample values at each m/z ratio level at each time increment are subtracted from each corresponding one from the operational runs, thus producing a difference value at each mass ratio for each of the multiple runs at each time increment. If the MS value minus the background noise does not exceed a preset value, the m/z ratio data point is not recorded, thus eliminating background noise, chemical noise and false positive peaks from the mass spectrometer data. The stored data for each of the multiple runs is then compared to a predetermined value at each m/z ratio and the resultant series of peaks, which are now determined to be above the background, is stored in the m/z points in which the peaks are of significance.
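By way of illustration only, the subtraction of control values and the discarding of data points that do not exceed a preset value can be sketched as follows. The sketch ignores the time dimension and peak widths, and the requirement that a peak survive in every repeated run is an added assumption. The function names and data are hypothetical and do not reproduce the method of U.S. Pat. No. 6,147,344.

```python
def subtract_background(sample, control, threshold):
    """Subtract control intensities from sample intensities at each m/z
    and keep only differences that exceed the preset threshold."""
    kept = {}
    for mz, intensity in sample.items():
        diff = intensity - control.get(mz, 0.0)
        if diff > threshold:
            kept[mz] = diff
    return kept


def consistent_peaks(runs, control, threshold):
    """Return the m/z values that remain above background in every run
    (an assumed consistency criterion across repeated runs)."""
    if not runs:
        return []
    per_run = [set(subtract_background(r, control, threshold)) for r in runs]
    return sorted(set.intersection(*per_run))


# Hypothetical intensities keyed by m/z.
control = {100.0: 5.0, 200.0: 50.0}
runs = [
    {100.0: 30.0, 200.0: 55.0, 300.0: 40.0},
    {100.0: 28.0, 200.0: 90.0, 300.0: 8.0},
]
print(consistent_peaks(runs, control, 10.0))
```

In this example only the peak at m/z 100 exceeds the background by more than the preset value in both runs, so the other two m/z values would not be recorded.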

One possibility for the utilization of mass spectrometry in high throughput screening is the integration of SPR with mass spectrometry. Approaches that have been tried are direct analysis of the analyte retained on the sensor chip and mass spectrometry with the eluted analyte (Sonksen et al., 1998, Anal. Chem. 70:2731-2736; Nelson & Krone, 1999, J. Mol. Recog. 12:77-93). Further developments, especially in the interfacing of the sensor chip with the mass spectrometer and in reusing the sensor chip, are required to make SPR combined with mass spectroscopy a high-throughput method for biomolecular interaction analysis and the screening of targets for small molecule inhibitors (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356).

In one embodiment of the present invention, the target nucleic acid complexed to a compound can be determined by any of the mass spectrometry processes described supra. Furthermore, mass spectrometry can also be used to elucidate the structure of the compound.

5.6.5 Scintillation Proximity Assay (“SPA”)

Scintillation proximity assay (“SPA”) is a method that can be used for screening small molecules that bind to the target RNAs. SPA would involve radiolabeling either the target RNA or the compound and then quantitating its binding to the other member immobilized on a bead or a surface impregnated with a scintillant (Cook, 1996, Drug Discov. Today 1:287-294). Currently, fluorescence-based techniques are preferred for high-throughput screening (Pope et al., 1999, Drug Discov. Today 4:350-362).

Screening for small molecules that inhibit Tat peptide:TAR RNA interaction has been performed with SPA, and inhibitors of the interaction were isolated and characterized (Mei et al., 1997, Bioorg. Med. Chem. 5:1173-1184; Mei et al., 1998, Biochemistry 37:14204-14212). A similar approach can be used to identify small molecules that directly bind to a preselected target RNA element in accordance with the invention.

SPA can be adapted to high throughput screening by the availability of microplates, wherein the scintillant is directly incorporated into the plastic of the microtiter wells (Nakayama et al., 1998, J. Biomol. Screening 3:43-48). Thus, one embodiment of the present invention comprises (a) labeling of the target nucleic acid with a radioactive or fluorescent label; (b) contacting the labeled nucleic acid with compounds, wherein each compound is in a microtiter well coated with scintillant and is tethered to the microtiter well; and (c) identifying and quantifying the compounds bound to the target nucleic acid with SPA, wherein the compound is identified by virtue of its location in the microplate.

5.6.6 Structure-Activity Relationships (“SAR”) by NMR Spectroscopy

NMR spectroscopy is a valuable technique for identifying complexed target nucleic acids by qualitatively determining changes in chemical shift, specifically from distances measured using relaxation effects, and NMR-based approaches have been used in the identification of small molecule binders of protein drug targets (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). The determination of structure-activity relationships (“SAR”) by NMR is the first method for NMR described in which small molecules that bind adjacent subsites are identified by two-dimensional ¹H-¹⁵N spectra of the target protein (Shuker et al., 1996, Science 274:1531-1534). The signal from the bound molecule is monitored by employing line broadening, transferred NOEs and pulsed field gradient diffusion measurements (Moore, 1999, Curr. Opin. Biotechnol. 10:54-58). A strategy for lead generation by NMR using a library of small molecules has been recently described (Fejzo et al., 1999, Chem. Biol. 6:755-769).

In one embodiment of the present invention, the target nucleic acid complexed to a compound can be determined by SAR by NMR. Furthermore, SAR by NMR can also be used to elucidate the structure of the compound.

5.6.7 Size Exclusion Chromatography

In another embodiment of the present invention, size-exclusion chromatography is used to purify compounds that are bound to a target nucleic acid from a complex mixture of compounds. Size-exclusion chromatography separates molecules based on their size and uses gel-based media comprised of beads with specific size distributions. When applied to a column, this media settles into a tightly packed matrix and forms a complex array of pores. Separation is accomplished by the inclusion or exclusion of molecules by these pores based on molecular size. Small molecules are included into the pores and, consequently, their migration through the matrix is retarded due to the added distance they must travel before elution. Large molecules are excluded from the pores and migrate with the void volume when applied to the matrix. In the present invention, a target nucleic acid is incubated with a mixture of compounds while free in solution and allowed to reach equilibrium. When applied to a size exclusion column, compounds free in solution are retained by the column, and compounds bound to the target nucleic acid are passed through the column. In a preferred embodiment, spin columns commonly used for “desalting” of nucleic acids will be employed to separate bound from unbound compounds (e.g., Bio-Spin columns manufactured by BIO-RAD). In another embodiment, the size exclusion matrix is packed into multiwell plates to allow high throughput separation of mixtures (e.g., PLASMID 96-well SEC plates manufactured by Millipore).

5.6.8 Affinity Chromatography

In one embodiment of the present invention, affinity capture is used to purify compounds that are bound to a target nucleic acid labeled with an affinity tag from a complex mixture of compounds. To accomplish this, a target nucleic acid labeled with an affinity tag is incubated with a mixture of compounds while free in solution and then captured to a solid support once equilibrium has been established; alternatively, target nucleic acids labeled with an affinity tag can be captured to a solid support first and then allowed to reach equilibrium with a mixture of compounds.

The solid support is typically comprised of, but not limited to, cross-linked agarose beads that are coupled with a ligand for the affinity tag. Alternatively, the solid support may be a glass, silicon, metal, carbon, or plastic (e.g., polystyrene, polypropylene) surface with or without a self-assembled monolayer (SAM), either with a covalently attached ligand for the affinity tag, or with inherent affinity for the tag on the target nucleic acid.

Once the complex between the target nucleic acid and compound has reached equilibrium and has been captured, one skilled in the art will appreciate that the retention of bound compounds and removal of unbound compounds is facilitated by washing the solid support with large excesses of binding reaction buffer. Furthermore, retention of high affinity compounds and removal of low affinity compounds can be accomplished by a number of means that increase the stringency of washing; these means include, but are not limited to, increasing the number and duration of washes, raising the salt concentration of the wash buffer, addition of detergent or surfactant to the wash buffer, and addition of non-specific competitor to the wash buffer.

In one embodiment, the compounds themselves are detectably labeled with fluorescent dyes, radioactive isotopes, or nanoparticles. When the compounds are applied to the captured target nucleic acid in a spatially addressed fashion (e.g., in separate wells of a 96-well microplate), binding between the compounds and the target nucleic acid can be determined by the presence of the detectable label on the compound using fluorescence.

Following the removal of unbound compounds, bound compounds with high affinity for the target nucleic acid can be eluted from the immobilized target nucleic acids and analyzed. The elution of compounds can be accomplished by any means that break the non-covalent interactions between the target nucleic acid and compound. Means for elution include, but are not limited to, changing the pH, changing the salt concentration, the application of organic solvents, and the application of molecules that compete with the bound ligand. In a preferred embodiment, the means employed for elution will release the compound from the target RNA, but will not affect the interaction between the affinity tag and the solid support, thereby achieving selective elution of compound. Moreover, a preferred embodiment will employ an elution buffer that is volatile to allow for subsequent concentration by lyophilization of the eluted compound (e.g., 0 M to 5 M ammonium acetate).

In another embodiment of the invention, the target RNA can be labeled with biotin, an antigen, or a ligand. Library beads complexed to the target RNA can be separated from uncomplexed beads using affinity techniques designed to capture the labeled moiety on the target RNA. For example, a solid support, such as but not limited to, a column or a well in a microwell plate coated with avidin/streptavidin, an antibody to the antigen, or a receptor for the ligand can be used to capture or immobilize the labeled beads. Complexed RNA may or may not be irreversibly bound to the bead by a further transformation between the bound RNA and an additional moiety on the surface of the bead. Such linking methods include, but are not limited to: photochemical crosslinking between RNA and bead-bound molecules such as psoralen, thymidine or uridine derivatives either present as monomers, oligomers, or as a partially complementary sequence; or chemical ligation by disulfide exchange, nitrogen mustards, bond formation between an electrophile and a nucleophile, or alkylating reagents. See, e.g., International Patent Publication WO/0146461, the contents of which are hereby incorporated by reference. The unbound library beads can be removed after the binding reaction by washing the solid phase. If the RNA is irreversibly bound to the bead, compounds can be isolated from the bead following destruction of the bound RNA by preferably, but not limited to, enzymatic or chemical (e.g., alkaline hydrolysis) degradation. The library beads bound to the solid phase can then be eluted with any solution that disrupts the binding between the labeled target RNA and the solid phase. Such solutions include high salt solutions, low pH solutions, detergents, and chaotropic denaturants, and are well known to one of skill in the art. In another embodiment, the compounds can be eluted from the solid phase by heat.

In one embodiment, the library of compounds can be prepared on magnetic beads, such as Dynabeads Streptavidin (Dynal Biotech, Oslo, Norway). The magnetic bead library can then be mixed with the labeled target RNA under conditions that allow binding to occur. The separation of the beads from unbound target RNA in the liquid phase can be accomplished using a magnet. After removal of the magnetic field, the bead complexed to the labeled RNA may be separated from uncomplexed library beads via the label used on the target RNA; e.g., biotinylated target RNA can be captured by avidin/streptavidin; target RNA labeled with antigen can be captured by the appropriate antibody; target RNA labeled with ligand can be captured using the appropriate immobilized receptor. The captured library bead can then be eluted with any solution that disrupts the binding between the labeled target RNA and the immobilized surface. Such solutions include high salt solutions, low pH solutions, detergents, and chaotropic denaturants, and are well known to one of skill in the art. Complexed RNA may or may not be irreversibly bound to the bead by a further transformation between the bound RNA and an additional moiety on the surface of the bead. Such linking methods include, but are not limited to: photochemical crosslinking between RNA and bead-bound molecules such as psoralen, thymidine or uridine derivatives either present as monomers, oligomers, or as a partially complementary sequence; or chemical ligation by disulfide exchange, nitrogen mustards, bond formation between an electrophile and a nucleophile, or alkylating reagents. See, e.g., International Patent Publication WO/0146461, the contents of which are hereby incorporated by reference. If the RNA is irreversibly bound to the bead, compounds can be isolated from the bead following destruction of the bound RNA by enzymatic degradation including, but not limited to, ribonucleases A, U₂, CL₃, T₁, Phy M, and B. cereus, or by chemical degradation including, but not limited to, piperidine-promoted backbone cleavage of abasic sites (following treatment with sodium hydroxide, hydrazine, piperidine formate, or dimethyl sulfate), or metal-assisted (e.g., nickel(II), cobalt(II), or iron(II)) oxidative cleavage.

In another embodiment, the preselected target RNA can be labeled with a heavy metal tag and incubated with the library beads to allow binding of the compounds to the target RNA. The separation of the labeled beads from unlabeled beads can be accomplished using a magnetic field. After removal of the magnetic field, the compound can be eluted with any solution that disrupts the binding between the preselected target RNA and the compound. Such solutions include high salt solutions, low pH solutions, detergents, and chaotropic denaturants, and are well known to one of skill in the art. In another embodiment, the compounds can be eluted from the solid phase by heat.

5.6.9 Nanoparticle Aggregation

In one embodiment of the present invention, both the target nucleic acid and the compounds are labeled with nanoparticles. A nanoparticle is a cluster of ions with controlled size from 0.1 to 1000 nm comprised of metals, metal oxides, or semiconductors including, but not limited to, Ag₂S, ZnS, CdS, CdTe, Au, or TiO₂. Methods for the attachment of nucleic acids and small molecules to nanoparticles are well known to one of skill in the art (reviewed in Niemeyer, 2001, Angew. Chem. Int. Ed. 40:4129-4158. The references cited therein are hereby incorporated by reference in their entireties). In particular, if multiple copies of the target nucleic acid are attached to a single nanoparticle and multiple copies of a compound are attached to another nanoparticle, then interaction between the compound and target nucleic acid will induce aggregation of nanoparticles as described, for example, by Mitchel et al. 1999, J. Am. Chem. Soc. 121:8122-8123. The aggregate can be detected by changes in absorbance or fluorescence spectra and physically separated from the unbound components through filtration or centrifugation.

5.6.10 Flow Cytometry

In a preferred embodiment, the complexed and non-complexed target nucleic acids are separated by flow cytometry methods. Flow cytometers for sorting and examining biological cells are well known in the art; this technology can be applied to separate the labeled library beads from unlabeled beads. Known flow cytometers are described, for example, in U.S. Pat. Nos. 4,347,935; 5,464,581; 5,483,469; 5,602,039; 5,643,796; and 6,211,477; the entire contents of which are incorporated by reference herein. Other known flow cytometers are the FACS Vantage™ system manufactured by Becton Dickinson and Company, and the COPAS™ system manufactured by Union Biometrica.

A flow cytometer typically includes a sample reservoir for receiving a biological sample. The biological sample contains particles (hereinafter referred to as “beads”) that are to be analyzed and sorted by the flow cytometer. Beads are transported from the sample reservoir at high speed (>100 beads/second) to a flow cell in a stream of liquid “sheath” fluid. High-frequency vibrations of a nozzle that directs the stream to the flow cell cause the stream to partition and form ordered droplets, with each droplet containing a single bead. Physical properties of beads can be measured as they intersect a laser beam within the cytometer flow cell. As beads move one by one through the interrogation point, they cause the laser light to scatter, and fluorescent molecules on the labeled beads (i.e., beads complexed with labeled target RNA) become excited. Alternatively, if the target nucleic acid is labeled with an inorganic nanoparticle, the beads complexed with bound target nucleic acid can be distinguished not only by unique fluorescent properties but also on the basis of spectrometric properties (e.g., including but not limited to increased optical density due to the reduction of Ag⁺ ions in the presence of gold nanoparticles (see, e.g., Taton et al. Science 2000, 289: 1757-1760)).
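By way of illustration only, the distinction between labeled and unlabeled beads at the interrogation point can be sketched as a simple fluorescence gate. The sketch below is not taken from any of the cited instruments; the function names, the event record layout, and the choice of a gate at the control mean plus three standard deviations are assumptions made for the example.

```python
import statistics

def fluorescence_gate(control_intensities, n_sd=3.0):
    """Gate threshold derived from unlabeled control beads.

    Assumption for this sketch: the gate is placed n_sd sample standard
    deviations above the mean fluorescence of the control beads.
    """
    mean = statistics.mean(control_intensities)
    sd = statistics.stdev(control_intensities)
    return mean + n_sd * sd

def sort_beads(events, threshold):
    """Split bead events into (labeled, unlabeled) by fluorescence intensity.

    Each event is a hypothetical record with 'bead_id' and 'fluorescence' keys.
    """
    labeled = [e for e in events if e["fluorescence"] > threshold]
    unlabeled = [e for e in events if e["fluorescence"] <= threshold]
    return labeled, unlabeled

# Hypothetical data: control beads incubated without labeled target RNA.
control = [10.0, 12.0, 11.0, 9.0, 10.0]
events = [
    {"bead_id": "b1", "fluorescence": 50.0},
    {"bead_id": "b2", "fluorescence": 11.0},
]
gate = fluorescence_gate(control)
labeled, unlabeled = sort_beads(events, gate)
```

In practice the gate would be set empirically on the instrument, and scatter parameters would normally be used alongside fluorescence to exclude debris and bead doublets.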

5.6.11 Manual Batch

In one embodiment, a manual batch mode is used for separating complexed beads. To explore a bead-based library within a reasonable time period, the primary screens should be operated with sufficient throughput. To do this, the target nucleic acid is labeled with a dye and then incubated with the combinatorial library. An advantage of such an assay is the fast identification of active library beads by color change. At lower concentrations of the dye-labeled target molecule, only those library beads that bind the target molecules most tightly are detected because of the higher local concentration of the dye. When washed and plated into a liquid monolayer, colored beads are easily separated from non-colored beads with the aid of a dissecting microscope. One potential problem associated with this method is interaction between the red dye and the library substrates. Control experiments using the dye alone and dye attached to mutant RNA sequences with the libraries are performed to eliminate this possibility.

5.6.12 Suspension of Beads in Electric Fields

In another embodiment of the invention, library beads bound to the target RNA can be separated from unbound beads on the basis of the altered charge properties due to RNA binding. In a preferred embodiment of this technique, beads are separated from unbound nucleic acid and suspended, preferably but not necessarily, in the presence of an electric field where the bound RNA causes the beads bound to the target RNA to migrate toward the anode, or positive, end of the field.

Beads can be preferentially suspended in solution as a colloidal suspension with the aid of detergents or surfactants. Typical detergents useful in the methods of the present invention include, but are not limited to, anionic detergents, such as salts of deoxycholic acid, 1-heptanesulfonic acid, N-laurylsarcosine, lauryl sulfate, 1-octanesulfonic acid, carboxymethylcellulose, carrageenan, and taurocholic acid; cationic detergents such as benzalkonium chloride, cetylpyridinium, methylbenzethonium chloride, and decamethonium bromide; zwitterionic detergents such as CHAPS, CHAPSO, alkyl betaines, alkyl amidoalkyl betaines, N-dodecyl-N,N-dimethyl-3-ammonio-1-propanesulfonate, and phosphatidylcholine; and non-ionic detergents such as n-decyl-α-D-glucopyranoside, n-decyl-D-maltopyranoside, n-dodecyl-D-maltoside, n-octyl-D-glucopyranoside, sorbitan esters, n-tetradecyl-D-maltoside and tritons. Preferably, the detergent, if present, is a nonionic detergent. Typical surfactants useful in the methods of the present invention include, but are not limited to, ammonium lauryl sulfate, polyethylene glycols, butyl glucoside, decyl glucoside, Polysorbate 80, lauric acid, myristic acid, palmitic acid, potassium palmitate, undecanoic acid, lauryl betaine, and lauryl alcohol.

Complexed RNA may or may not be irreversibly bound to the bead by a further transformation between the bound RNA and an additional moiety on the surface of the bead. Such linking methods include, but are not limited to: photochemical crosslinking between RNA and bead-bound molecules such as psoralen, thymidine or uridine derivatives either present as monomers, oligomers, or as a partially complementary sequence; or chemical ligation by disulfide exchange, nitrogen mustards, bond formation between an electrophile and a nucleophile, or alkylating reagents.

If the RNA is irreversibly bound to the bead, compounds can be isolated from the bead following destruction of the bound RNA by enzymatic degradation including, but not limited to, ribonucleases A, U₂, CL₃, T₁, Phy M, and B. cereus, or by chemical degradation including, but not limited to, piperidine-promoted backbone cleavage of abasic sites (following treatment with sodium hydroxide, hydrazine, piperidine formate, or dimethyl sulfate), or metal-assisted (e.g., nickel(II), cobalt(II), or iron(II)) oxidative cleavage.

5.6.13 Microwave Spectroscopy

In another embodiment, the complexed beads are separated from uncomplexed beads by microwave spectroscopy. For example, as described in U.S. Pat. Nos. 6,395,480; 6,376,258; 6,368,795; 6,340,568; 6,338,968; 6,287,874; and 6,287,776 to Hefti, the disclosures of which are hereby incorporated by reference, the unique dielectric properties of molecules and binding complexes, such as hybridization complexes formed between a nucleic acid probe and a nucleic acid target, molecular binding events, and protein/ligand complexes, result in varying microwave spectra which can be measured. The molecule's dielectric properties can be observed by coupling a test signal to the molecule and observing the resulting signal. When the test signal excites the molecule at a frequency within the molecule's dispersion regime, especially at a resonant frequency, the molecule will interact strongly with the signal, and the resulting signal will exhibit dramatic variations in its measured amplitude and phase, thereby generating a unique signal response. This response can be used to detect and identify the bound molecular structure. In addition, because most molecules will exhibit different dispersion properties over the same or different frequency bands, each generates a unique signal response which can be used to identify the molecular structure.

5.7 Methods for Identifying or Characterizing the Compounds Bound to the Target Nucleic Acids

If the library comprises arrays or microarrays of compounds, wherein each compound has an address or identifier, the compound can be deconvoluted, e.g., by cross-referencing the positive sample to the original compound list that was applied to the individual test assays.
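By way of illustration only, deconvolution of an addressed library amounts to a lookup of each positive address in the original compound list. The following minimal sketch assumes a hypothetical plate map keyed by well address; the addresses, compound identifiers, and function name are invented for the example.

```python
def deconvolute(positive_addresses, plate_map):
    """Cross-reference positive assay addresses against the original compound list.

    Returns (hits, unknown): hits maps each recognized address to its compound
    identifier; unknown lists addresses that are absent from the plate map.
    """
    hits = {}
    unknown = []
    for address in positive_addresses:
        if address in plate_map:
            hits[address] = plate_map[address]
        else:
            unknown.append(address)
    return hits, unknown

# Hypothetical plate map: well address -> compound identifier.
plate_map = {"A1": "CMPD-0001", "A2": "CMPD-0002", "B1": "CMPD-0013"}

hits, unknown = deconvolute(["A2", "B1", "H12"], plate_map)
```

Reporting unrecognized addresses separately, rather than silently dropping them, makes a mismatch between the assay readout and the compound list visible.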

If the library is a peptide or nucleic acid library, the sequence of the compound can be determined by direct sequencing of the peptide or nucleic acid. Such methods are well known to one of skill in the art.

A number of physico-chemical techniques can be used for the de novo characterization of compounds bound to the target.

5.7.1 Mass Spectrometry

Mass spectrometry (e.g., electrospray ionization (“ESI”), matrix-assisted laser desorption-ionization (“MALDI”), and Fourier-transform ion cyclotron resonance (“FT-ICR”)) can be used both for high-throughput screening of compounds that bind to a target RNA and for elucidating the structure of the compound. Thus, one advantage of mass spectrometry is that separation of bound and unbound complexes and compound structure elucidation can be carried out in a single step.

MALDI uses a pulsed laser for desorption of the ions and a time-of-flight analyzer, and has been used for the detection of noncovalent tRNA:amino-acyl-tRNA synthetase complexes (Gruic-Sovulj et al., 1997, J. Biol. Chem. 272:32084-32091). However, covalent cross-linking between the target nucleic acid and the compound is required for detection, since a non-covalently bound complex may dissociate during the MALDI process.

ESI mass spectrometry (“ESI-MS”) has been of greater utility for studying non-covalent molecular interactions because, unlike the MALDI process, ESI-MS generates molecular ions with little to no fragmentation (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). ESI-MS has been used to study the complexes formed by HIV Tat peptide and protein with the TAR RNA (Sannes-Lowery et al., 1997, Anal. Chem. 69:5130-5135).

Fourier-transform ion cyclotron resonance (“FT-ICR”) mass spectrometry provides high-resolution spectra, isotope-resolved precursor ion selection, and accurate mass assignments (Xavier et al., 2000, Trends Biotechnol. 18(8):349-356). FT-ICR has been used to study the interaction of aminoglycoside antibiotics with cognate and non-cognate RNAs (Hofstadler et al., 1999, Anal. Chem. 71:3436-3440; Griffey et al., 1999, Proc. Natl. Acad. Sci. USA 96:10129-10133). As is true for all of the mass spectrometry methods discussed herein, FT-ICR does not require labeling of the target RNA or a compound.

An advantage of mass spectroscopy is not only the elucidation of the structure of the compound, but also the determination of the structure of the compound bound to the preselected target RNA. Such information can enable the discovery of a consensus structure of a compound that specifically binds to a preselected target RNA.

In a specific embodiment, the structure of the compound is determined by time of flight mass spectroscopy (“TOF-MS”). In time of flight methods of mass spectrometry, charged (ionized) molecules are produced in a vacuum and accelerated by an electric field into a time of flight tube or drift tube. The velocity to which the molecules are accelerated is proportional to the square root of the accelerating potential, proportional to the square root of the charge of the molecule, and inversely proportional to the square root of the mass of the molecule. The charged molecules travel, i.e., “drift” down the TOF tube to a detector. The time taken for the molecules to travel down the tube may be interpreted as a measure of their molecular weight. Time-of-flight mass spectrometers have been developed for all of the major ionization techniques such as, but not limited to, electron impact (“EI”), infrared laser desorption (“IRLD”), plasma desorption (“PD”), fast atom bombardment (“FAB”), secondary ion mass spectrometry (“SIMS”), matrix-assisted laser desorption/ionization (“MALDI”), and electrospray ionization (“ESI”).
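By way of illustration only, the relationship between drift time and mass can be sketched under the idealized assumption that all of the ion's kinetic energy comes from the accelerating potential (zeV = ½mv²) and that the ion then drifts field-free over a fixed length. The function names and the example instrument parameters (20 kV, 1 m drift tube) are assumptions for this sketch; real instruments are calibrated against standards rather than computed from first principles.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs
DALTON_KG = 1.66053906660e-27           # kilograms per dalton

def ion_velocity(mass_da, charge_state, accel_potential_v):
    """Velocity (m/s) after acceleration, from z*e*V = (1/2)*m*v**2."""
    mass_kg = mass_da * DALTON_KG
    energy_j = charge_state * ELEMENTARY_CHARGE_C * accel_potential_v
    return math.sqrt(2.0 * energy_j / mass_kg)

def flight_time(mass_da, charge_state, accel_potential_v, drift_length_m):
    """Time (s) to traverse a field-free drift tube of the given length."""
    return drift_length_m / ion_velocity(mass_da, charge_state, accel_potential_v)

def mass_from_flight_time(time_s, charge_state, accel_potential_v, drift_length_m):
    """Invert the flight-time relation to recover the mass in daltons."""
    energy_j = charge_state * ELEMENTARY_CHARGE_C * accel_potential_v
    mass_kg = 2.0 * energy_j * (time_s / drift_length_m) ** 2
    return mass_kg / DALTON_KG

# Hypothetical singly charged ions of 1000 Da and 4000 Da.
t_1000 = flight_time(1000.0, 1, 20000.0, 1.0)
t_4000 = flight_time(4000.0, 1, 20000.0, 1.0)
```

Under these assumptions the flight time scales with the square root of the mass-to-charge ratio, so a fourfold heavier ion of the same charge takes twice as long to reach the detector.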

5.7.2 Edman Degradation

In an embodiment wherein the library is a peptide library or a derivative thereof, Edman degradation can be used to determine the structure of the compound. In one embodiment, a modified Edman degradation process is used to obtain compositional tags for proteins, which is described in U.S. Pat. No. 6,277,644 to Farnsworth et al., which is hereby incorporated by reference in its entirety. The Edman degradation chemistry is separated from amino acid analysis, circumventing the serial requirement of the conventional Edman process. Multiple cycles of coupling and cleavage are performed prior to extraction and compositional analysis of amino acids. The amino acid composition information is then used to search a database of known protein or DNA sequences to identify the sample protein. An apparatus for performing this method comprises a sample holder for holding the sample, a coupling agent supplier for supplying at least one coupling agent, a cleavage agent supplier for supplying a cleavage agent, a controller for directing the sequential supply of the coupling agents, cleavage agents, and other reagents necessary for performing the modified Edman degradation reactions, and an analyzer for analyzing amino acids.

In another embodiment, the method can be automated as described in U.S. Pat. No. 5,565,171 to Dovichi et al., which is hereby incorporated by reference in its entirety. The apparatus includes a continuous capillary connected between two valves that control fluid flow in the capillary. One part of the capillary forms a reaction chamber where the sample may be immobilized for subsequent reaction with reagents supplied through the valves. Another part of the capillary passes through or terminates in the detector portion of an analyzer such as an electrophoresis apparatus, liquid chromatographic apparatus or mass spectrometer. The apparatus may form a peptide or protein sequencer for carrying out the Edman degradation reaction and analyzing the reaction product produced by the reaction. The protein or peptide sequencer includes a reaction chamber for carrying out coupling and cleavage on a peptide or protein to produce a derivatized amino acid residue, a conversion chamber for carrying out conversion and producing a converted amino acid residue, and an analyzer for identifying the converted amino acid residue. The reaction chamber may be contained within one arm of a capillary and the conversion chamber is located in another arm of the capillary. An electrophoresis length of capillary is coupled directly to the conversion chamber to allow electrophoresis separation of the converted amino acid residue as it leaves the conversion chamber. Identification of the converted amino acid residue takes place at one end of the electrophoresis length of the capillary.

5.7.3 NMR Spectroscopy

As described above, NMR spectroscopy is a technique for identifying binding sites in target nucleic acids by qualitatively determining changes in chemical shift, specifically from distances measured using relaxation effects. Examples of NMR that can be used for the invention include, but are not limited to, one-dimensional NMR, two-dimensional NMR, correlation spectroscopy (“COSY”), and nuclear Overhauser effect (“NOE”) spectroscopy. Such methods of structure determination of compounds are well known to one of skill in the art.

Similar to mass spectroscopy, an advantage of NMR is not only the elucidation of the structure of the compound, but also the determination of the structure of the compound bound to the preselected target RNA. Such information can enable the discovery of a consensus structure of a compound that specifically binds to a preselected target RNA.

5.7.4 Vibrational Spectroscopy

Vibrational spectroscopy (e.g., infrared (IR) spectroscopy or Raman spectroscopy) can be used for elucidating the structure of the compound on the isolated bead.

Infrared spectroscopy measures the frequencies of infrared light (wavelengths from 100 to 10,000 nm) absorbed by the compound as a result of excitation of vibrational modes according to quantum mechanical selection rules which require that absorption of light cause a change in the electric dipole moment of the molecule. The infrared spectrum of any molecule is a unique pattern of absorption wavelengths of varying intensity that can be considered as a molecular fingerprint to identify any compound.

Infrared spectra can be measured in a scanning mode by measuring the absorption by the compound of individual frequencies of light, produced by a grating which separates frequencies from a mixed-frequency infrared light source, relative to a standard intensity (double-beam instrument) or pre-measured (‘blank’) intensity (single-beam instrument). In a preferred embodiment, infrared spectra are measured in a pulsed mode (FT-IR) where a mixed beam, produced by an interferometer, of all infrared light frequencies is passed through or reflected off the compound. The resulting interferogram, which may or may not be added to the interferograms from subsequent pulses to increase the signal strength while averaging random noise in the electronic signal, is mathematically transformed into a spectrum using Fourier Transform or Fast Fourier Transform algorithms.
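By way of illustration only, the co-addition of interferograms followed by Fourier transformation can be sketched as follows. This is a simplified model rather than instrument software: the interferogram is treated as an evenly sampled list of intensities, a direct discrete Fourier transform stands in for an FFT, apodization and phase correction are omitted, and all names and data are assumptions for the example.

```python
import cmath
import math

def co_add(interferograms):
    """Average repeated interferograms point by point.

    Averaging is equivalent to adding up to a constant scale factor; random
    noise partially cancels while the reproducible signal is preserved.
    """
    n_points = len(interferograms[0])
    n_scans = len(interferograms)
    return [sum(scan[i] for scan in interferograms) / n_scans for i in range(n_points)]

def dft_magnitude(signal):
    """Magnitude spectrum (bins 0..n/2) by a direct discrete Fourier transform."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(signal[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        spectrum.append(abs(acc) / n)
    return spectrum

# Synthetic example: one spectral component at bin 5, plus equal and opposite
# disturbances in the two scans so that co-addition cancels them exactly.
n = 64
clean = [math.cos(2 * math.pi * 5 * j / n) for j in range(n)]
scan_a = [v + 0.2 * (-1) ** j for j, v in enumerate(clean)]
scan_b = [v - 0.2 * (-1) ** j for j, v in enumerate(clean)]

spectrum = dft_magnitude(co_add([scan_a, scan_b]))
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

A production implementation would use a Fast Fourier Transform routine, which computes the same transform at far lower cost for long interferograms.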

Raman spectroscopy measures the difference in frequency, due to absorption of infrared frequencies, of scattered visible or ultraviolet light relative to the incident beam. The incident monochromatic light beam, usually a single laser frequency, is not truly absorbed by the compound but interacts with the electric field transiently. Most of the light scattered off the sample will be unchanged (Rayleigh scattering), but a portion of the scattered light will have frequencies that are the sum or difference of the incident and molecular vibrational frequencies. The selection rules for Raman (inelastic) scattering require a change in polarizability of the molecule. While some vibrational transitions are observable in both infrared and Raman spectrometry, most are observable only with one or the other technique. The Raman spectrum of any molecule is a unique pattern of absorption wavelengths of varying intensity that can be considered as a molecular fingerprint to identify any compound.
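By way of illustration only, the sum and difference frequencies described above can be computed from the incident wavelength and a vibrational frequency expressed in wavenumbers. The function name and the example values (a 532 nm laser and a 1000 cm⁻¹ vibrational mode) are assumptions for this sketch.

```python
def raman_wavelengths_nm(incident_nm, shift_cm1):
    """Return (stokes_nm, anti_stokes_nm) for a given vibrational shift.

    Wavenumber in cm^-1 equals 1e7 divided by wavelength in nm. The Stokes
    line is the difference frequency (incident minus vibrational) and the
    anti-Stokes line is the sum frequency (incident plus vibrational).
    """
    incident_cm1 = 1e7 / incident_nm
    stokes_nm = 1e7 / (incident_cm1 - shift_cm1)
    anti_stokes_nm = 1e7 / (incident_cm1 + shift_cm1)
    return stokes_nm, anti_stokes_nm

# Hypothetical example: 532 nm excitation, 1000 cm^-1 vibrational mode.
stokes_nm, anti_stokes_nm = raman_wavelengths_nm(532.0, 1000.0)
```

The Stokes line appears at a longer wavelength than the incident beam and the anti-Stokes line at a shorter one, with the Rayleigh line lying between them at the unchanged incident wavelength.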

Raman spectra are measured by directing monochromatic light at the sample, either passed through or preferably reflected off, filtering out the Rayleigh scattered light, and detecting the frequency of the Raman scattered light. An improved Raman spectrometer is described in U.S. Pat. No. 5,786,893 to Fink et al., which is hereby incorporated by reference.

Vibrational spectra can be measured in a spatially resolved fashion to address single beads by integration of a visible microscope and spectrometer. A microscopic infrared spectrometer is described in U.S. Pat. No. 5,581,085 to Reffner et al., which is hereby incorporated by reference in its entirety. An instrument that simultaneously performs a microscopic infrared and microscopic Raman analysis on a sample is described in U.S. Pat. No. 5,841,139 to Sostek et al., which is hereby incorporated by reference in its entirety.

In one embodiment of the method, compounds are synthesized on polystyrene beads doped with chemically modified styrene monomers such that each resulting bead has a characteristic pattern of absorption lines in the vibrational (IR or Raman) spectrum, by methods including but not limited to those described by Fenniri et al., 2000, J. Am. Chem. Soc. 123:8151-8152. Using methods of split-pool synthesis familiar to one of skill in the art, the library of compounds is prepared so that the spectroscopic pattern of the bead identifies one of the components of the compound on the bead. Beads that have been separated according to their ability to bind target RNA can be identified by their vibrational spectrum. In one embodiment of the method, appropriate sorting and binning of the beads during synthesis then allows identification of one or more further components of the compound on any one bead. In another embodiment of the method, partial identification of the compound on a bead is possible through use of the spectroscopic pattern of the bead with or without the aid of further sorting during synthesis, followed by partial resynthesis of the possible compounds aided by doped beads and appropriate sorting during synthesis.

In another embodiment, the IR or Raman spectra of compounds are examined while the compound is still on a bead, preferably, or after cleavage from the bead, using methods including but not limited to photochemical, acid, or heat treatment. The compound can be identified by comparison of the IR or Raman spectral pattern to spectra previously acquired for each compound in the combinatorial library.

In a specific embodiment, compounds can be identified by matching the IR or Raman spectra of a compound to a dataset of vibrational (IR or Raman) spectra previously acquired for each compound in the combinatorial library. By this method, the spectra of compounds with known structure are recorded so that comparison with these spectra can identify compounds again when isolated from RNA binding experiments.
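By way of illustration only, matching a measured spectrum against a dataset of previously acquired spectra can be sketched as a nearest-neighbor search. Cosine similarity is used here as one possible similarity measure and is an assumption of this sketch, as are the function names, the compound identifiers, and the four-point spectra, which stand in for intensities sampled on a common wavenumber grid.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors on the same grid."""
    if len(a) != len(b):
        raise ValueError("spectra must be sampled on the same grid")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def identify(query, reference_spectra):
    """Return (compound_id, score) of the best-matching reference spectrum."""
    best_id, best_score = None, -1.0
    for compound_id, reference in reference_spectra.items():
        score = cosine_similarity(query, reference)
        if score > best_score:
            best_id, best_score = compound_id, score
    return best_id, best_score

# Hypothetical reference dataset and a noisy measurement of compound A.
reference_spectra = {
    "compound_A": [0.0, 1.0, 0.0, 0.5],
    "compound_B": [1.0, 0.0, 0.2, 0.0],
}
best_id, best_score = identify([0.0, 0.9, 0.05, 0.45], reference_spectra)
```

A minimum acceptable score would normally be imposed so that a spectrum resembling none of the reference compounds is reported as unidentified rather than assigned to the nearest entry.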

5.7.5 Microwave Spectroscopy

In another embodiment, the microwave spectra of a compound can be used to elucidate the structure of the compound. For example, as described in U.S. Pat. Nos. 6,395,480; 6,376,258; 6,368,795; 6,340,568; 6,338,968; 6,287,874; and 6,287,776 to Hefti, the disclosures of which are hereby incorporated by reference, the unique dielectric properties of molecules and binding complexes, such as hybridization complexes formed between a nucleic acid probe and a nucleic acid target, molecular binding events, and protein/ligand complexes, result in varying microwave spectra which can be measured. The molecule's dielectric properties can be observed by coupling a test signal to the molecule and observing the resulting signal. When the test signal excites the molecule at a frequency within the molecule's dispersion regime, especially at a resonant frequency, the molecule will interact strongly with the signal, and the resulting signal will exhibit dramatic variations in its measured amplitude and phase, thereby generating a unique signal response. This response can be used to detect and identify the bound molecular structure. In addition, because most molecules will exhibit different dispersion properties over the same or different frequency bands, each generates a unique signal response which can be used to identify the molecular structure.

5.7.6 X-Ray Crystallography

X-ray crystallography can be used to elucidate the structure of a compound. For a review of x-ray crystallography see, e.g., Blundell et al. 2002, Nat Rev Drug Discov 1(1):45-54. The first step in x-ray crystallography is the formation of crystals. The formation of crystals begins with the preparation of highly purified and soluble samples. The conditions for crystallization are then determined by optimizing several solution variables known to induce nucleation, such as pH, ionic strength, temperature, and specific concentrations of organic additives, salts and detergent. Techniques have been developed to automate the crystallization process and the production of high-quality protein crystals. Once crystals have been formed, the crystals are harvested and prepared for data collection. The crystals are then analyzed by diffraction (such as multi-circle diffractometers, high-speed CCD detectors, and detector off-set). Generally, multiple crystals must be screened for structure determinations.

A number of methods can be used to acquire a diffraction pattern so that a compound can be characterized. In one embodiment, an X-ray source is provided, for example, by a rotating anode generator producing an X-ray beam of a characteristic wavelength. There are a number of sources of X-ray radiation that may be used in the methods of the invention, including low and high intensity radiation. In one example, the tunable X-ray radiation is produced by a synchrotron. In another embodiment, the primary X-ray beam is monochromated by either crystal monochromators or focusing mirrors and the beam is passed through a helium flushed collimator. In a preferred embodiment, the crystal is mounted on a pin on a goniometer head that is mounted to a goniometer, which allows the crystal to be positioned in different orientations in the beam. The diffracted X-rays can be recorded using a number of techniques, including, but not limited to, image plates, multiwire detectors or CCD cameras. In other embodiments, flash cooling, for example, of protein crystals, to cryogenic temperatures (˜100 K) offers many advantages, the most significant of which is the elimination of radiation damage.

5.8 Naturally Occurring Genes with Premature Stop Codons: Examples of Disorders and Diseases

The invention provides for the use of naturally occurring genes with premature stop codons to ascertain the effects of compounds on premature translation termination and/or nonsense-mediated mRNA decay. In general, the expression of the gene product, in particular, a full-length gene product, is indicative of the effect of the compounds on premature translation termination and/or nonsense-mediated mRNA decay.

In a preferred embodiment, the naturally occurring genes with premature stop codons are genes that cause diseases which are due, in part, to the lack of expression of the gene resulting from the premature stop codon. Such diseases include, but are not limited to, cystic fibrosis, muscular dystrophy, heart disease (e.g., familial hypercholesterolemia), p53-associated cancers (e.g., lung, breast, colon, pancreatic, non-Hodgkin's lymphoma, ovarian, and esophageal cancer), colorectal carcinomas, neurofibromatosis, retinoblastoma, Wilms' tumor, retinitis pigmentosa, collagen disorders (e.g., osteogenesis imperfecta and cirrhosis), Tay Sachs disease, blood disorders (e.g., hemophilia, von Willebrand disease, β-thalassemia), kidney stones, ataxia-telangiectasia, lysosomal storage diseases, and tuberous sclerosis. Genes involved in the etiology of these diseases are discussed below.

The recognition of translation termination signals is not necessarily limited to a simple trinucleotide stop codon; rather, the signal is recognized through the sequences surrounding the stop codon in addition to the stop codon itself (see, e.g., Manuvakhova et al., 2000, RNA 6(7):1044-1055, which is hereby incorporated by reference in its entirety). Thus, any genes containing particular tetranucleotide sequences at the stop codon, such as, but not limited to, UGAC, UAGU, UAGC, UAGG, UAAC, UAAU, UAAG, and UAAA, are candidates for naturally occurring genes with premature stop codons that are useful in the present invention. A list of human disease genes that contain these particular sequence motifs, sorted by chromosome, is presented as an Example in Section 8.
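By way of illustration only, the tetranucleotide context of a stop codon can be extracted from a coding sequence by scanning in frame for the first stop codon and taking the nucleotide that immediately follows it. The sketch below assumes that the input begins at the first nucleotide of the reading frame; the function name and the example sequences are invented, and the candidate set simply reproduces the tetranucleotides listed above.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Tetranucleotide termination signals named in the text above.
CANDIDATE_TETRANUCLEOTIDES = {
    "UGAC", "UAGU", "UAGC", "UAGG", "UAAC", "UAAU", "UAAG", "UAAA",
}

def first_stop_context(sequence):
    """Find the first in-frame stop codon and its tetranucleotide context.

    The sequence is read in frame from position 0; DNA input (T) is converted
    to RNA (U). Returns (codon_index, tetranucleotide), where the
    tetranucleotide is None if no nucleotide follows the stop codon, or
    returns None if no in-frame stop codon is present.
    """
    rna = sequence.upper().replace("T", "U")
    for start in range(0, len(rna) - 2, 3):
        if rna[start:start + 3] in STOP_CODONS:
            context = rna[start:start + 4]
            return start // 3, (context if len(context) == 4 else None)
    return None

# Hypothetical reading frame: AUG GCC UGA CGG, with the stop codon followed by C.
result = first_stop_context("AUGGCCUGACGG")
is_candidate = result is not None and result[1] in CANDIDATE_TETRANUCLEOTIDES
```

Scanning strictly in steps of three ensures that a stop triplet occurring out of frame, such as the UGA spanning the first two codons of AUGAAA, is not reported.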

5.8.1 Cystic Fibrosis

Cystic fibrosis is caused by mutations in the cystic fibrosis transmembrane conductance regulator (“CFTR”) gene. Such mutations vary between populations and depend on a multitude of factors such as, but not limited to, ethnic background and geographic location. Nonsense mutations in the CFTR gene are expected to produce little or no CFTR chloride channels. Several nonsense mutations in the CFTR gene have been identified (see, e.g., Tzetis et al., 2001, Hum Genet. 109(6):592-601; Strandvik et al., 2001, Genet Test. 5(3):235-42; Feldmann et al., 2001, Hum Mutat. 17(4):356; Wilschanski et al., 2000, Am J Respir Crit Care Med. 161(3 Pt 1):860-5; Castaldo et al., 1999, Hum Mutat. 14(3):272; Mittre et al., 1999, Hum Mutat. 14(2):182; Mickle et al., 1998, Hum Mol Genet. 7(4):729-35; Casals et al., 1997, Hum Genet. 101(3):365-70; Mittre et al., 1996, Hum Mutat. 8(4):392-3; Bonizzato et al., 1995, Hum Genet. 95(4):397-402; Greil et al., 1995, Wien Klin Wochenschr. 107(15):464-9; Zielenski et al., 1995, Hum Mutat. 5(1):43-7; Dork et al., 1994, Hum Genet. 94(5):533-42; Balassopoulou et al., 1994, Hum Mol Genet. 3(10):1887-8; Ghanem et al., 1994, 21(2):434-6; Will et al., 1994, J Clin Invest. 93(4):1852-9; Hull et al., 1994, Genomics. 19(2):362-4; Dork et al., Hum Genet. 93(1):67-73; Rolfini & Cabrini, 1993, J Clin Invest. 92(6):2683-7; Will et al., 1993, J Med Genet. 30(10):833-7; Bienvenu et al., 1993, J Med Genet. 30(7):621-2; Cheadle et al., 1993, Hum Mol Genet. 2(7):1067-8; Casals et al., 1993, Hum Genet. 91(1):66-70; Reiss et al., 1993, Hum Genet. 91(1):78-9; Chevalier-Porst et al., 1992, Mol Genet. 1(8):647-8; Hamosh et al., 1992, Hum Mol Genet. 1(7):542-4; Gasparini et al., 1992, J Med Genet. 29(8):558-62; Fanen et al., 1992, Genomics. 13(3):770-6; Jones et al., 1992, Hum Mol Genet. 1(1):11-7; Ronchetto et al., 1992, Genomics. 12(2):417-8; Macek et al., 1992, Hum Mutat. 1(6):501-2; Shoshani et al., 1992, Am J Hum Genet. 50(1):222-8; Schloesser et al., 1991, J Med Genet. 28(12):878-80; Hamosh et al., 1991, J Clin Invest. 88(6):1880-5; Bal et al., 1991, J Med Genet. 28(10):715-7; Dork et al., 1991, Hum Genet. 87(4):441-6; Beaudet et al., 1991, Am J Hum Genet. 48(6):1213; Gasparini et al., 1991, Genomics. 10(1):193-200; Cutting et al., 1990, N Engl J Med. 323(24):1685-9; and Kerem et al., 1990, Proc Natl Acad Sci USA. 87(21):8447-51, the disclosures of which are hereby incorporated by reference in their entireties). Any CFTR gene encoding a premature translation termination codon including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that modulate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.2 Muscular Dystrophy

Muscular dystrophy is a genetic disease characterized by severe, progressive muscle wasting and weakness. Duchenne muscular dystrophy and Becker muscular dystrophy are generally caused by nonsense mutations of the dystrophin gene (see, e.g., Kerr et al., 2001, Hum Genet. 109(4):402-7 and Wagner et al., 2001, Ann Neurol. 49(6):706-11). Nonsense mutations in other genes have also been implicated in other types of muscular dystrophy, such as, but not limited to, collagen genes in Ullrich congenital muscular dystrophy (see, e.g., Demir et al., 2002, Am J Hum Genet. 70(6):1446-58), the emerin gene and lamin genes in Emery-Dreifuss muscular dystrophy (see, e.g., Holt et al., 2001, Biochem Biophys Res Commun. 287(5):1129-33; Becane et al., 2000, Pacing Clin Electrophysiol. 23(11 Pt 1):1661-6; and Bonne et al., 2000, Ann Neurol. 48(2):170-80), the dysferlin gene in Miyoshi myopathy (see, e.g., Nakagawa et al., 2001, J Neurol Sci. 184(1):15-9), the plectin gene in late-onset muscular dystrophy (see, e.g., Bauer et al., 2001, Am J Pathol. 158(2):617-25), the delta-sarcoglycan gene in recessive limb-girdle muscular dystrophy (see, e.g., Duggan et al., 1997, Neurogenetics. 1(1):49-58), the laminin α2-chain gene in congenital muscular dystrophy (see, e.g., Mendell et al., 1998, Hum Mutat. 12(2):135), the plectin gene in late-onset muscular dystrophy (see, e.g., Rouan et al., 2000, J Invest Dermatol. 114(2):381-7 and Kunz et al., 2000, J Invest Dermatol. 114(2):376-80), the myophosphorylase gene in McArdle's disease (see, e.g., Bruno et al., 1999, Neuromuscul Disord. 9(1):34-7), and the collagen VI gene in Bethlem myopathy (see, e.g., Lamande et al., 1998, Hum Mol Genet. 7(6):981-9).

Several nonsense mutations in the dystrophin gene have been identified (see, e.g., Kerr et al., 2001, Hum Genet. 109(4):402-7; Mendell et al., 2001, Neurology 57(4):645-50; Fajkusova et al., 2001, Neuromuscul Disord. 11(2):133-8; Ginjaar et al., 2000, Eur J Hum Genet. 8(10):793-6; Lu et al., 2000, J Cell Biol. 148(5):985-96; Tuffery-Giraud et al., 1999, Hum Mutat. 14(5):359-68; Fajkusova et al., 1998, J Neurogenet. 12(3):183-9; Tuffery et al., 1998, Hum Genet. 102(3):334-42; Shiga et al., 1997, J Clin Invest. 100(9):2204-10; Winnard et al., 1995, Am J Hum Genet. 56(1):158-66; Prior et al., 1994, Am J Med Genet. 50(1):68-73; Prior et al., 1993, Hum Mol Genet. 2(3):311-3; Prior et al., 1993, Hum Mutat. 2(3):192-5; Nigro et al., 1992, Hum Mol Genet. 1(7):517-20; Worton, 1992, J Inherit Metab Dis. 15(4):539-50; and Bulman et al., 1991, Genomics. 10(2):457-60; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in muscular dystrophy including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.3 Familial Hypercholesterolemia

Hypercholesterolemia, or high blood cholesterol, results from either the overproduction or the underutilization of low density lipoprotein (“LDL”). Hypercholesterolemia is caused by either the genetic disease familial hypercholesterolemia or the consumption of a high cholesterol diet. Nonsense mutations in the LDL receptor gene have been implicated in familial hypercholesterolemia. Several nonsense mutations in the LDL receptor gene have been identified (see, e.g., Lind et al., 2002, Atherosclerosis 163(2):399-407; Salazar et al., 2002, Hum Mutat. 19(4):462-3; Kuhrova et al., 2000, Hum Mutat. 19(1):80; Zakharova et al., 2001, Bioorg Khim. 27(5):393-6; Kuhrova et al., 2001, Hum Mutat. 18(3):253; Genschel et al., 2001, Hum Mutat. 17(4):354; Weiss et al., 2000, J Inherit Metab Dis. 23(8):778-90; Mozas et al., 2000, Hum Mutat. 15(5):483-4; Shin et al., 2000, Clin Genet. 57(3):225-9; Graham et al., 1999, Atherosclerosis 147(2):309-16; Hattori et al., 1999, Hum Mutat. 14(1):87; Cenarro et al., 1998, Hum Mutat. 11(5):413; Rodningen et al., 1999, Hum Mutat. 13(3):186-96; Hirayama et al., 1998, J Hum Genet. 43(4):250-4; Lind et al., 1998, J Intern Med. 244(1):19-25; Thiart et al., 1997, Mol Cell Probes 11(6):457-8; Maruyama et al., 1995, Arterioscler Thromb Vasc Biol. 15(10):1713-8; Koivisto et al., 1995, Am J Hum Genet. 57(4):789-97; Lombardi et al., 1995, J Lipid Res. 36(4):860-7; Leren et al., 1993, Hum Genet. 92(1):6-10; Landsberger et al., 1992, Am J Hum Genet. 50(2):427-33; Loux et al., 1992, Hum Mutat. 1(4):325-32; Motulsky, 1989, Arteriosclerosis. 9(1 Suppl):I3-7; Lehrman et al., 1987, J Biol Chem. 262(1):401-10; and Lehrman et al., 1985, Cell 41(3):735-43; the disclosures of which are hereby incorporated by reference in their entireties). Any LDL receptor gene encoding a premature translation codon including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.4 p53-Associated Cancers

Mutant forms of the p53 protein, which is thought to act as a negative regulator of cell proliferation, transformation, and tumorigenesis, have been implicated as a common genetic change characteristic of human cancer (see, e.g., Levine et al., 1991, Nature 351:453-456 and Hollstein et al., 1991, Science 253:49-53). p53 mutations have been implicated in cancers such as, but not limited to, lung cancer, breast cancer, colon cancer, pancreatic cancer, non-Hodgkin's lymphoma, ovarian cancer, and esophageal cancer.

Nonsense mutations have been identified in the p53 gene and have been implicated in cancer. Several nonsense mutations in the p53 gene have been identified (see, e.g., Masuda et al., 2000, Tokai J Exp Clin Med. 25(2):69-77; Oh et al., 2000, Mol Cells 10(3):275-80; Li et al., 2000, Lab Invest. 80(4):493-9; Yang et al., 1999, Zhonghua Zhong Liu Za Zhi 21(2):114-8; Finkelstein et al., 1998, Mol Diagn. 3(1):37-41; Kajiyama et al., 1998, Dis Esophagus. 11(4):279-83; Kawamura et al., 1999, Leuk Res. 23(2):115-26; Radig et al., 1998, Hum Pathol. 29(11):1310-6; Schuyer et al., 1998, Int J Cancer 76(3):299-303; Wang-Gohrke et al., 1998, Oncol Rep. 5(1):65-8; Fulop et al., 1998, J Reprod Med. 43(2):119-27; Ninomiya et al., 1997, J Dermatol Sci. 14(3):173-8; Hsieh et al., 1996, Cancer Lett. 100(1-2):107-13; Rall et al., 1996, Pancreas. 12(1):10-7; Fukutomi et al., 1995, Nippon Rinsho. 53(11):2764-8; Frebourg et al., 1995, Am J Hum Genet. 56(3):608-15; Dove et al., 1995, Cancer Surv. 25:335-55; Adamson et al., 1995, Br J Haematol. 89(1):61-6; Grayson et al., 1994, Am J Pediatr Hematol Oncol. 16(4):341-7; Lepelley et al., 1994, Leukemia. 8(8):1342-9; McIntyre et al., 1994, J Clin Oncol. 12(5):925-30; Horio et al., 1994, Oncogene. 9(4):1231-5; Nakamura et al., 1992, Jpn J Cancer Res. 83(12):1293-8; Davidoff et al., 1992, Oncogene. 7(1):127-33; and Ishioka et al., 1991, Biochem Biophys Res Commun. 177(3):901-6; the disclosures of which are hereby incorporated by reference in their entireties). Any p53 gene encoding a premature translation codon including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.5 Colorectal Carcinomas

Molecular genetic abnormalities resulting in colorectal carcinoma involve tumor-suppressor genes that undergo inactivation (such as, but not limited to, apc, mcc, dcc, p53, and possibly genes on chromosomes 8p, 1p, and 22q) and dominant-acting oncogenes (such as, but not limited to, ras, src, and myc) (see, e.g., Hamilton, 1992, Cancer 70(5 Suppl):1216-21). Nonsense mutations in the adenomatous polyposis coli (“APC”) gene and mismatch repair genes (such as, but not limited to, mlh1 and msh-2) have also been described. Nonsense mutations have been implicated in colorectal carcinomas (see, e.g., Viel et al., 1997, Genes Chromosomes Cancer. 18(1):8-18; Akiyama et al., 1996, Cancer 78(12):2478-84; Itoh & Imai, 1996, Hokkaido Igaku Zasshi 71(1):9-14; Kolodner et al., 1994, Genomics. 24(3):516-26; Ohue et al., 1994, Cancer Res. 54(17):4798-804; and Yin et al., 1993, Gastroenterology. 104(6):1633-9; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in colorectal carcinoma including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.6 Neurofibromatosis

Neurofibromatosis is an inherited disorder, which is commonly caused by mutations in the NF1 and NF2 tumor suppressor genes. It is characterized by multiple intracranial tumors including schwannomas, meningiomas, and ependymomas. Nonsense mutations in the NF1 and NF2 genes have been described. Nonsense mutations have been implicated in neurofibromatosis (see, e.g., Lamszus et al., 2001, Int J Cancer 91(6):803-8; Sestini et al., 2000, Hum Genet. 107(4):366-71; Fukasawa et al., 2000, J Cancer Res. 91(12):1241-9; Park et al., 2000, J Hum Genet. 45(2):84-5; Ueki et al., 1999, Cancer Res. 59(23):5995-8;, 1999, Hokkaido Igaku Zasshi. 74(5):377-86; Buske et al., 1999, Am J Med Genet. 86(4):328-30; Harada et al., 1999, Surg Neurol. 51(5):528-35; Krkljus et al., 1998, Hum Mutat. 11(5):411; Klose et al., 1999, Am J Med Genet. 83(1):6-12; Park & Pivnick, 1998, J Med Genet. 35(10):813-20; Bahuau et al., 1998, Am J Med Genet. 75(3):265-72; Bijlsma et al., 1997, J Med Genet. 34(11):934-6; MacCollin et al., 1996, Ann Neurol. 40(3):440-5; Upadhyaya et al., 1996, Am J Med Genet. 67(4):421-3; Robinson et al., 1995, Hum Genet. 96(1):95-8; Legius et al., 1995, J Med Genet. 32(4):316-9; von Deimling et al., 1995, Brain Pathol. 5(1):11-4; Dublin et al., 1995, Hum Mutat. 5(1):81-5; Legius et al., 1994, Genes Chromosomes Cancer. 10(4):250-5; Purandare et al., 1994, Hum Mol Genet. 3(7):1109-15; Shen & Upadhyaya, 1993, Hum Genet. 92(4):410-2; and Estivill et al., 1991, Hum Genet. 88(2):185-8; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in neurofibromatosis including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.7 Retinoblastoma

The retinoblastoma gene plays important roles in the genesis of human cancers. Several pieces of evidence have shown that the retinoblastoma protein has dual roles in gating cell cycle progression and promoting cellular differentiation (see, e.g., Lee & Lee, 1997, Gan To Kagaku Ryoho 24(11):1368-80 for a review). Nonsense mutations in the RB1 gene have been described. Nonsense mutations have been implicated in retinoblastoma (see, e.g., Klutz et al., 2002, Am J Hum Genet. 71(1):174-9; Alonso et al., 2001, Hum Mutat. 17(5):412-22; Wong et al., 2000, Cancer Res. 60(21):6171-7; Harbour, 1998, Ophthalmology 105(8):1442-7; Fulop et al., 1998, J Reprod Med. 43(2):119-27; Onadim et al., 1997, Br J Cancer 76(11):1405-9; Lohmann et al., 1997, Ophthalmologe 94(4):263-7; Cowell & Cragg, 1996, Eur J Cancer. 32A(10):1749-52; Lohmann et al., 1996, Am J Hum Genet. 58(5):940-9; Shapiro et al., 1995, Cancer Res. 55(24):6200-9; Huang et al., 1993, Cancer Res. 53(8):1889-94; and Cheng & Haas, 1990, Mol Cell Biol. 10(10):5502-9; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in retinoblastoma including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.8 Wilms' Tumor

Wilms' tumor, or nephroblastoma, is an embryonal malignancy of the kidney that affects children. Nonsense mutations in the WT1 gene have been implicated in Wilms' tumor. Several nonsense mutations in the WT1 gene have been identified (see, e.g., Nakadate et al., 1999, Genes Chromosomes Cancer 25(1):26-32; Diller et al., 1998, J Clin Oncol. 16(11):3634-40; Schumacher et al., 1997, Proc Natl Acad Sci USA. 94(8):3972-7; Coppes et al., 1993, Proc Natl Acad Sci USA. 90(4):1416-9; and Little et al., 1992, Proc Natl Acad Sci USA. 89(11):4791-5; the disclosures of which are hereby incorporated by reference in their entireties). Any WT1 gene encoding a premature translation codon including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.9 Retinitis Pigmentosa

Retinitis pigmentosa is a genetic disease in which affected individuals develop progressive degeneration of the rod and cone photoreceptors. Retinitis pigmentosa cannot be explained by a single genetic defect; instead, the hereditary aberration responsible for triggering the onset of the disease is localized in different genes and at different sites within these genes (reviewed in, e.g., Kohler et al., 1997, Klin Monatsbl Augenheilkd 211(2):84-93). Nonsense mutations have been implicated in retinitis pigmentosa (see, e.g., Ching et al., 2002, Neurology 58(11):1673-4; Zhang et al., 2002, Zhonghua Yi Xue Yi Chuan Xue Za Zhi. 19(3):194-7; Zhang et al., 2002, Hum Mol Genet. 11(9):993-1003; Dietrich et al., 2002, Br J Ophthalmol. 86(3):328-32; Grayson et al., 2002, J Med Genet 39(1):62-7; Liu et al., 2001, Zhonghua Yi Xue Za Zhi 81(2):71-2; Damji et al., 2001, Can J Ophthalmol. 36(5):252-9; Berson et al., 2001, Invest Ophthalmol Vis Sci. 42(10):2217-24; Chan et al., 2001, Br J Ophthalmol. 85(9):1046-8; Baum et al., 2001, Hum Mutat. 17(5):436; Mashima et al., 2001, Ophthalmic Genet. 22(1):43-7; Zwaenepoel et al., 2001, Hum Mutat. 17(1):34-41; Bork et al., 2001, Am J Hum Genet. 68(1):26-37; Sharon et al., 2000, Invest Ophthalmol Vis Sci. 41(9):2712-21; Dreyer et al., 2000, Eur J Hum Genet. 8(7):500-6; Liu et al., 2000, Hum Mutat. 15(6):584; Wang et al., 1999, Exp Eye Res. 69(4):; Bowne et al., 1999, Hum Mol Genet. 8(11):2121-8; Guillonneau et al., 1999, Hum Mol Genet. 8(8):1541-6; Dryja et al., 1999, Invest Ophthalmol Vis Sci. 40(8):1859-65; Sullivan et al., 1999, Nat Genet. 22(3):255-9; Pierce et al., 1999, Nat Genet. 22(3):248-54; Janecke et al., 1999, Hum Mutat. 13(2):133-40; Cuevas et al., 1998, Mol Cell Probes 12(6):417-20; Schwahn et al., 1998, Nat Genet. 19(4):327-32; Buraczynska et al., 1997, Am J Hum Genet. 61(6):1287-92; Meindl et al., 1996, Nat Genet. 13(1):35-42; Keen et al., 1996, Hum Mutat. 8(4):297-303; Dryja et al., 1995, Proc Natl Acad Sci USA. 92(22):10177-81; Apfelstedt-Sylla et al., 1995, Br J Ophthalmol. 79(1):28-34; Bayes et al., 1995, Hum Mutat. 5(3):228-34; Shastry, 1994, Am J Med Genet. 52(4):467-74; Gal et al., 1994, Nat Genet. 7(1):64-8; Sargan et al., 1994, Gene Ther. 1 Suppl 1:S89; McLaughlin et al., 1993, Nat Genet. 4(2):130-4; and Rosenfeld et al., 1992, Nat Genet. 1(3):209-13; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in retinitis pigmentosa including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.10 Osteogenesis Imperfecta

Osteogenesis imperfecta is a heterogeneous disorder of type I collagen that varies in severity and results from mutations in the genes that encode the proalpha chains of type I collagen. Nonsense mutations have been implicated in the genes that encode the proalpha chains of type I collagen (“COLA1” genes) (see, e.g., Slayton et al., 2000, Matrix Biol. 19(1):1-9; Bateman et al., 1999, Hum Mutat. 13(4):311-7; and Willing et al., 1996, Am J Hum Genet. 59(4):799-809; the disclosures of which are hereby incorporated by reference in their entireties). Any COLA1 gene encoding a premature translation codon including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.11 Cirrhosis

Cirrhosis generally refers to a chronic liver disease that is marked by replacement of normal tissue with fibrous tissue. The multidrug resistance 3 gene has been implicated in cirrhosis, and nonsense mutations have been identified in this gene (see, e.g., Jacquemin et al., 2001, Gastroenterology. 120(6):1448-58; the disclosure of which is hereby incorporated by reference in its entirety). Any gene involved in cirrhosis encoding a premature translation codon including, but not limited to, the nonsense mutations described in the reference cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.12 Tay Sachs Disease

Tay Sachs disease is an autosomal recessive disorder affecting the central nervous system. The disorder results from mutations in the gene encoding the alpha-subunit of beta-hexosaminidase A, a lysosomal enzyme composed of alpha and beta polypeptides. Several nonsense mutations have been implicated in Tay Sachs disease (see, e.g., Rajavel & Neufeld, 2001, Mol Cell Biol. 21(16):5512-9; Myerowitz, 1997, Hum Mutat. 9(3):195-208; Akli et al., 1993, Hum Genet. 90(6):614-20; Mules et al., 1992, Am J Hum Genet. 50(4):834-41; and Akli et al., 1991, Genomics. 11(1):124-34; the disclosures of which are hereby incorporated by reference in their entireties). Any hexosaminidase gene encoding a premature translation codon including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.13 Blood Disorders

Hemophilia is caused by a deficiency in blood coagulation factors. Affected individuals are at risk for spontaneous bleeding into organs and treatment usually consists of administration of clotting factors. Hemophilia A is caused by a deficiency of blood coagulation factor VIII and hemophilia B is caused by a deficiency in blood coagulation factor IX. Nonsense mutations in the genes encoding coagulation factors have been implicated in hemophilia (see, e.g., Dansako et al., 2001, Ann Hematol. 80(5):292-4; Moller-Morlang et al., 1999, Hum Mutat. 13(6):504; Kamiya et al., 1998, Rinsho Ketsueki 39(5):402-4; Freson et al., 1998, Hum Mutat. 11(6):470-9; Kamiya et al., 1995, Int J Hematol. 62(3):175-81; Walter et al., 1994, Thromb Haemost. 72(1):74-7; Figueiredo, 1993, Braz J Med Biol Res. 26(9):919-31; Reiner & Thompson, 1992, Hum Genet. 89(1):88-94; Koeberl et al., 1990, Hum Genet. 84(5):387-90; Driscoll et al., 1989, Blood. 74(2):737-42; Chen et al., 1989, Am J Hum Genet. 44(4):567-9; Mikami et al., 1988, Jinrui Idengaku Zasshi. 33(4):409-15; Gitschier et al., 1988, Blood 72(3):1022-8; and Sommer et al., 1987, Mayo Clin Proc. 62(5):387-404; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in hemophilia including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

Von Willebrand disease is a single-locus disorder resulting from a deficiency of von Willebrand factor, a multimeric multifunctional protein involved in platelet adhesion and platelet-to-platelet cohesion in high shear stress vessels, and in protecting from proteolysis and directing circulating factor VIII to the site of injury (reviewed in Rodeghiero, 2002, Haemophilia. 8(3):292-300). Nonsense mutations have been implicated in von Willebrand disease (see, e.g., Rodeghiero, 2002, Haemophilia. 8(3):292-300; Enayat et al., 2001, Blood 98(3):674-80; Surdhar et al., 2001, Blood 98(1):248-50; Casana et al., 2000, Br J Haematol. 111(2):552-5; Baronciani et al., 2000, Thromb Haemost. 84(4):536-40; Fellowes et al., 2000, Blood 96(2):773-5; Waseem et al., 1999, Thromb Haemost. 81(6):900-5; Mohlke et al., 1999, Int J Clin Lab Res. 29(1):1-7; Rieger et al., 1998, Thromb Haemost. 80(2):332-7; Kenny et al., 1998, Blood 92(1):175-83; Mazurier et al., 1998, Ann Genet. 41(1):34-43; Hagiwara et al., 1996, Thromb Haemost. 76(2):253-7; Mazurier & Meyer, 1996, Baillieres Clin Haematol. 9(2):229-41; Schneppenheim et al., 1994, Hum Genet. 94(6):640-52; Zhang et al., 1994, Genomics 21(1):188-93; Ginsburg & Sadler, 1993, Thromb Haemost. 69(2):177-84; Eikenboom et al., 1992, Thromb Haemost. 68(4):448-54; Zhang et al., 1992, Am J Hum Genet. 51(4):850-8; Zhang et al., 1992, Hum Mol Genet. 1(1):61-2; and Mancuso et al., 1991, Biochemistry 30(1):253-69; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in von Willebrand disease including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

β thalassemia is caused by a deficiency in beta globin polypeptides which in turn causes a deficiency in hemoglobin production. Nonsense mutations have been implicated in β thalassemia (see, e.g., El-Latif et al., 2002, Hemoglobin 26(1):33-40; Sanguansermsri et al., 2001, Hemoglobin 25(1):19-27; Romao, 2000, Blood 96(8):2895-901; Perea et al., 1999, Hemoglobin 23(3):231-7; Rhodes et al., 1999, Am J Med Sci. 317(5):341-5; Fonseca et al., 1998, Hemoglobin 22(3):197-207; Gasperini et al., 1998, Am J Hematol. 57(1):43-7; Galanello et al., 1997, Br J Haematol. 99(2):433-6; Pistidda et al., 1997, Eur J Haematol. 58(5):320-5; Oner et al., 1997, Haematol. 96(2):229-34; Yasunaga et al., 1995, Intern Med. 34(12):1198-200; Molina et al., 1994, Sangre (Barc) 39(4):253-6; Chang et al., 1994, Int J Hematol. 59(4):267-72; Gilman et al., 1994, Am J Hematol. 45(3):265-7; Chan et al., 1993, Prenat Diagn. 13(10):977-82; George et al., 1993, Med J Malaysia 48(3):325-9; Divoky et al., 1993, Haematol. 83(3):523-4; Fioretti et al., 1993, Hemoglobin 17(1):9-17; Rosatelli et al., 1992, Am J Hum Genet. 50(2):422-6; Moi et al., 1992, Blood 79(2):512-6; Loudianos et al., 1992, Hemoglobin 16(6):503-9; Fukumaki, 1991, Rinsho Ketsueki 32(6):587-91; Cao et al., 1991, Am J Pediatr Hematol Oncol. 13(2):179-88; Galanello et al., 1990, Clin Genet. 38(5):327-31; Liu, 1990, Zhongguo Yi Xue Ke Xue Yuan Xue Bao 12(2):90-5; Aulehla-Scholz et al., 1990, Hum Genet. 84(2):195-7; Cao et al., 1990, Ann N Y Acad Sci. 612:215-25; Sanguansermsri et al., 1990, Hemoglobin 14(2):157-68; Galanello et al., 1989, Blood 74(2):823-7; Rosatelli et al., 1989, Blood 73(2):601-5; Galanello et al., 1989, Clin Biol Res. 316B:113-21; Galanello et al., 1988, Am J Hematol. 29(2):63-6; Chan et al., 1988, Blood 72(4):1420-3; Atweh et al., 1988, J Clin Invest. 82(2):557-61; Masala et al., 1988, Hemoglobin 12(5-6):661-71; Pirastu et al., 1987, Proc Natl Acad Sci USA 84(9):2882-5; Kazazian et al., 1986, Am J Hum Genet. 38(6):860-7; Cao et al., 1986, Prenat Diagn. 6(3):159-67; Cao et al., 1985, Ann N Y Acad Sci. 445:380-92; Pirastu et al., 1984, Science 223(4639):929-30; Pirastu et al., 1983, N Engl J Med. 309(5):284-7; Trecartin et al., 1981, J Clin Invest. 68(4):1012-7; and Liebhaber et al., 1981, Trans Assoc Am Physicians 94:88-96; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in β thalassemia including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.14 Kidney Stones

Kidney stones (nephrolithiasis), which affect 12% of males and 5% of females in the western world, are familial in 45% of patients and are most commonly associated with hypercalciuria (see, e.g., Lloyd et al., 1996, Nature 379(6564):445-9). Mutations of the renal-specific chloride channel gene are associated with hypercalciuric nephrolithiasis (kidney stones). Nonsense mutations have been implicated in kidney stones (see, e.g., Hoopes et al., 1998, Kidney Int. 54(3):698-705; Lloyd et al., 1997, Hum Mol Genet. 6(8):1233-9; Lloyd et al., 1996, Nature 379(6564):445-9; and Pras et al., 1995, Am J Hum Genet. 56(6):1297-303; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in kidney stones including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.15 Ataxia-Telangiectasia

Ataxia-telangiectasia is characterized by increased sensitivity to ionizing radiation, increased incidence of cancer, and neurodegeneration and is generally caused by mutations in the ataxia-telangiectasia gene (see, e.g., Barlow et al., 1999, Proc Natl Acad Sci USA 96(17):9915-9). Nonsense mutations have been implicated in ataxia-telangiectasia (see, e.g., Camacho et al., 2002, Blood 99(1):238-44; Pitts et al., 2001, Hum Mol Genet. 10(11):1155-62; Laake et al., 2000, Hum Mutat. 16(3):232-46; Li & Swift, 2000, Am J Med Genet. 92(3):170-7; Teraoka et al., 1999, Am J Hum Genet. 64(6):1617-31; and Stoppa-Lyonnet et al., 1998, Blood 91(10):3920-6; the disclosures of which are hereby incorporated by reference in their entireties). Any gene encoding a premature translation codon implicated in ataxia-telangiectasia including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.16 Lysosomal Storage Diseases

There are more than 40 individually recognized lysosomal storage disorders. Each disorder results from a deficiency in the activity of a specific enzyme, which impedes the lysosome from carrying out its normal degradative role. These include but are not limited to the diseases listed subsequently. Aspartylglucosaminuria is caused by a deficiency of N-aspartyl-beta-glucosaminidase (Fisher et al., 1990, FEBS Lett. 269:440-444); cholesterol ester storage disease (Wolman disease) is caused by mutations in the LIPA gene (Fujiyama et al., 1996, Hum. Mutat. 8:377-380); mutations in the CTNS gene are associated with cystinosis (Town et al., 1998, Nature Genet. 18:319-324); mutations in α-galactosidase A are associated with Fabry disease (Eng et al., 1993, Pediat. Res. 33:128A; Sakuraba et al., 1990, Am. J. Hum. Genet. 47:784-789; Davies et al., 1993, Hum. Molec. Genet. 2:1051-1053; Miyamura et al., 1996, J. Clin. Invest. 98:1809-1817); fucosidosis is caused by mutations in the FUCA1 gene (Kretz et al., 1989, J. Molec. Neurosci. 1:177-180; Yang et al., 1992, Biochem. Biophys. Res. Commun. 189:1063-1068; Seo et al., 1993, Hum. Molec. Genet. 2:1205-1208); mucolipidosis type I results from mutations in the NEU1 gene (Bonten et al., 1996, Genes Dev. 10:3156-3169); mucolipidosis type IV results from mutations in the MCOLN1 gene (Bargal et al., 2000, Nature Genet. 26:120-123; Sun et al., 2000, Hum. Molec. Genet. 9:2471-2478); mucopolysaccharidosis type I (Hurler syndrome) is caused by mutations in the IDUA gene (Scott et al., 1992, Genomics 13:1311-1313; Bach et al., 1993, Am. J. Hum. Genet. 53:330-338); mucopolysaccharidosis type II (Hunter syndrome) is caused by mutations in the IDS gene (Sukegawa et al., 1992, Biochem. Biophys. Res. Commun. 183:809-813; Bunge et al., 1992, Hum. Molec. Genet. 1:335-339; Flomen et al., 1992, Genomics 13:543-550); mucopolysaccharidosis type IIIA (Sanfilippo syndrome type A) is caused by mutations in the SGSH gene (Yogalingam et al., 2001, Hum. Mutat. 18:264-281); mucopolysaccharidosis type IIIB (Sanfilippo syndrome type B) is caused by mutations in the NAGLU gene (Zhao et al., 1996, Proc. Nat. Acad. Sci. 93:6101-6105; Zhao et al., 1995, Am. J. Hum. Genet. 57:A185); mucopolysaccharidosis type IIID is caused by mutations in the glucosamine-6-sulfatase (G6S) gene (Robertson et al., 1988, Hum. Genet. 79:175-178); mucopolysaccharidosis type IVA (Morquio syndrome) is caused by mutations in the GALNS gene (Tomatsu et al., 1995, Am. J. Hum. Genet. 57:556-563; Tomatsu et al., 1995, Hum. Mutat. 6:195-196); mucopolysaccharidosis type VI (Maroteaux-Lamy syndrome) is caused by mutations in the ARSB gene (Litjens et al., 1992, Hum. Mutat. 1:397-402; Isbrandt et al., 1996, Hum. Mutat. 7:361-363); mucopolysaccharidosis type VII (Sly syndrome) is caused by mutations in the beta-glucuronidase (GUSB) gene (Yamada et al., 1995, Hum. Molec. Genet. 4:651-655); mutations in CLN1 (PPT1) cause infantile neuronal ceroid lipofuscinosis (Das et al., 1998, J. Clin. Invest. 102:361-370; Mitchison et al., 1998, Hum. Molec. Genet. 7:291-297); late infantile type ceroid lipofuscinosis is caused by mutations in the CLN2 gene (Sleat et al., 1997, Science 277:1802-1805); juvenile neuronal ceroid lipofuscinosis (Batten disease) is caused by mutations in the CLN3 gene (Mole et al., 1999, Hum. Mutat. 14:199-215); late infantile neuronal ceroid lipofuscinosis, Finnish variant, is caused by mutations in the CLN5 gene (Savukoski et al., 1998, Nature Genet. 19:286-288); late-infantile form of neuronal ceroid lipofuscinosis is caused by mutations in the CLN6 gene (Gao et al., 2002, Am. J. Hum. Genet. 70:324-335); Niemann-Pick disease is caused by mutations in the ASM gene (Takahashi et al., 1992, J. Biol. Chem. 267:12552-12558; types A and B) and the NPC1 gene (Millat et al., 2001, Am. J. Hum. Genet. 68:1373-1385; type C); Kanzaki disease is caused by mutations in the NAGA gene (Keulemans et al., 1996, J. Med. Genet. 33:458-464); Gaucher disease is caused by mutations in the GBA gene (Stone et al., 1999, Europ. J. Hum. Genet. 7:505-509); glycogen storage disease II is the prototypic lysosomal storage disease and is caused by mutations in the GAA gene (Becker et al., 1998, Am. J. Hum. Genet. 62:991-994); Krabbe disease is caused by mutations in the GALC gene (Sakai et al., 1994, Biochem. Biophys. Res. Commun. 198:485-491); Tay-Sachs disease is caused by mutations in the HEXA gene (Akli et al., 1991, Genomics 11:124-134; Mules et al., 1992, Am. J. Hum. Genet. 50:834-841; Triggs-Raine et al., 1991, Am. J. Hum. Genet. 49:1041-1054; Drucker et al., 1993, Hum. Mutat. 2:415-417; Shore et al., 1992, Hum. Mutat. 1:486-490); mutations in the GM2A gene cause Tay-Sachs variant AB (Schepers et al., 1996, Am. J. Hum. Genet. 59:1048-1056; Chen et al., 1999, Am. J. Hum. Genet. 65:77-87); mutations in the HEXB gene cause Sandhoff disease (Zhang et al., 1994, Hum. Molec. Genet. 3:139-145); alpha-mannosidosis type II is caused by mutations in the MAN2B1 gene (Gotoda et al., 1998, Am. J. Hum. Genet. 63:1015-1024; Autio et al., 1973, Acta Paediat. Scand. 62:555-565); metachromatic leukodystrophy is caused by mutations in the ARSA gene (Gieselmann et al., 1994, Hum. Mutat. 4:233-242). Any gene containing a premature translation codon implicated in lysosomal storage disorders including, but not limited to, the nonsense mutations and genes described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.8.17 Tuberous Sclerosis

Tuberous sclerosis complex (TSC) is a dominantly inherited disease characterized by the presence of hamartomata in multiple organ systems. The disease is caused by mutations in TSC1 (van Slegtenhorst et al., 1997, Science 277:805-808; Sato et al., 2002, J. Hum. Genet. 47:20-28) and/or TSC2 (Vrtel et al., 1996, J. Med. Genet. 33:47-51; Wilson et al., 1996, Hum. Molec. Genet. 5:249-256; Au et al., 1998, Am. J. Hum. Genet. 62:286-294; Verhoef et al., 1999, Europ. J. Pediat. 158:284-287; Carsillo et al., 2000, Proc. Nat. Acad. Sci. 97:6085-6090). Any gene containing a premature translation codon implicated in tuberous sclerosis including, but not limited to, the nonsense mutations described in the references cited above, can be used in the present invention to identify compounds that mediate premature translation termination and/or nonsense-mediated mRNA decay.

5.9 Secondary Biological Screens

5.9.1 In Vitro Assays

The compounds identified in the nonsense suppression assay (for convenience referred to herein as “lead” compounds) can be tested for biological activity using host cells containing or engineered to contain the target RNA element coupled to a functional readout system.

5.9.1.1 Reporter Gene Assays

The lead compound can be tested in a host cell engineered to contain the RNA with a premature translation termination codon controlling the expression of a reporter gene. In this example, the lead compounds are assayed in the presence or absence of the RNA with the premature translation termination codon. Compounds that modulate premature translation termination and/or nonsense-mediated mRNA decay will result in increased expression of the full-length gene, i.e., past the premature termination codon. Alternatively, a phenotypic or physiological readout can be used to assess activity of the target RNA with the premature translation termination codon in the presence and absence of the lead compound. In another embodiment of the invention, the compounds identified in the nonsense suppression assay (for convenience referred to herein as a “lead” compound) can also be tested for biological activity using an in vitro transcribed RNA from the gene with a premature translation termination codon and subsequent in vitro translation of that RNA in a cell-free translation extract. The activity of the lead compound in the in vitro translation mixture can be determined by any method that measures increased expression of the full-length gene, i.e., past the premature termination codon. For example, expression of a functional protein from the full-length gene (e.g., a reporter gene) can be measured to determine the effect of the lead compound on premature translation termination and/or nonsense-mediated mRNA decay in an in vitro system. Both the in vitro and in vivo nonsense suppression assays are described in International Patent Publication No. WO 01/44516 and International Patent Application No. PCT/US03/19760, each of which is incorporated by reference in its entirety.
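By way of illustration only, and not by way of limitation, the comparison of reporter expression in the presence and absence of a lead compound can be sketched as follows. The function names, the background-subtraction step, and the normalization to a reporter lacking a premature stop codon are assumptions made for this sketch and are not prescribed by the assays described above.

```python
def fold_induction(treated, untreated, background=0.0):
    """Ratio of background-subtracted reporter signal with vs. without compound.

    All three arguments are raw readout values (e.g., luminescence counts)
    from a reporter carrying a premature termination codon. Hypothetical
    helper for illustration only.
    """
    treated_net = treated - background
    untreated_net = untreated - background
    if untreated_net <= 0:
        raise ValueError("untreated signal must exceed background")
    return treated_net / untreated_net


def percent_readthrough(nonsense_signal, wild_type_signal, background=0.0):
    """Signal from the nonsense-containing reporter as a percentage of the
    signal from an otherwise identical reporter without a premature stop."""
    wild_type_net = wild_type_signal - background
    if wild_type_net <= 0:
        raise ValueError("wild-type signal must exceed background")
    return 100.0 * (nonsense_signal - background) / wild_type_net


if __name__ == "__main__":
    # Made-up readout values, used only to show the arithmetic.
    print(fold_induction(treated=1200, untreated=300, background=100))
    print(percent_readthrough(1100, 10100, background=100))
```

A fold induction greater than 1 under these assumptions would be consistent with increased expression past the premature termination codon; the threshold for calling a compound active is left to the practitioner.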

5.9.1.1.1 Reporter Gene Constructs, Transfected Cells and Cell-Free Extracts

The invention provides for reporter genes to ascertain the effects of a compound on premature translation termination and/or nonsense-mediated mRNA decay. In general, the level of expression and/or activity of a reporter gene product is indicative of the effect of the compound on premature translation termination and/or nonsense-mediated mRNA decay.

The invention provides for specific vectors comprising a reporter gene operably linked to one or more regulatory elements and host cells transfected with the vectors. The invention also provides for the in vitro translation of a reporter gene flanked by one or more regulatory elements. A reporter gene may or may not contain a premature stop codon depending on the assay conducted. Techniques for practicing this specific aspect of this invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, and recombinant DNA manipulation and production, which are routinely practiced by one of skill in the art. See, e.g., Sambrook, 1989, Molecular Cloning, A Laboratory Manual, Second Edition; DNA Cloning, Volumes I and II (Glover, Ed. 1985); Oligonucleotide Synthesis (Gait, Ed. 1984); Nucleic Acid Hybridization (Hames & Higgins, Eds. 1984); Transcription and Translation (Hames & Higgins, Eds. 1984); Animal Cell Culture (Freshney, Ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); Perbal, A Practical Guide to Molecular Cloning (1984); Gene Transfer Vectors for Mammalian Cells (Miller & Calos, Eds. 1987, Cold Spring Harbor Laboratory); Methods in Enzymology, Volumes 154 and 155 (Wu & Grossman, and Wu, Eds., respectively), (Mayer & Walker, Eds., 1987); Immunochemical Methods in Cell and Molecular Biology (Academic Press, London, Scopes, 1987), Expression of Proteins in Mammalian Cells Using Vaccinia Viral Vectors in Current Protocols in Molecular Biology, Volume 2 (Ausubel et al., Eds., 1991).

5.9.1.1.1.1 Reporter Genes

Any reporter gene well-known to one of skill in the art may be used in reporter gene constructs to ascertain the effect of a compound on premature translation termination. A reporter gene refers to a nucleotide sequence encoding a protein, polypeptide or peptide that is readily detectable either by its presence or activity. Reporter genes may be obtained and the nucleotide sequence of the elements determined by any method well-known to one of skill in the art. The nucleotide sequence of a reporter gene can be obtained, e.g., from the literature or a database such as GenBank. Alternatively, a polynucleotide encoding a reporter gene may be generated from nucleic acid from a suitable source. If a clone containing a nucleic acid encoding a particular reporter gene is not available, but the sequence of the reporter gene is known, a nucleic acid encoding the reporter gene may be chemically synthesized or obtained from a suitable source (e.g., a cDNA library, or a cDNA library generated from, or nucleic acid, preferably poly A+ RNA, isolated from, any tissue or cells expressing the reporter gene) by PCR amplification. Once the nucleotide sequence of a reporter gene is determined, the nucleotide sequence of the reporter gene may be manipulated using methods well-known in the art for the manipulation of nucleotide sequences, e.g., recombinant DNA techniques, site directed mutagenesis, PCR, etc. (see, for example, the techniques described in Sambrook et al., 1990, Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. and Ausubel et al., eds., 1998, Current Protocols in Molecular Biology, John Wiley & Sons, NY, which are both incorporated by reference herein in their entireties), to generate reporter genes having a different amino acid sequence, for example to create amino acid substitutions, deletions, and/or insertions.

In a specific embodiment, a reporter gene is any naturally-occurring gene with a premature stop codon. Genes with premature stop codons that are useful in the present invention include, but are not limited to, the genes described below. In an alternative embodiment, a reporter gene is any gene that is not known in nature to contain a premature stop codon. Examples of reporter genes include, but are not limited to, luciferase (e.g., firefly luciferase, renilla luciferase, and click beetle luciferase), green fluorescent protein (“GFP”) (e.g., green fluorescent protein, yellow fluorescent protein, red fluorescent protein, cyan fluorescent protein, and blue fluorescent protein), beta-galactosidase (“beta-gal”), beta-glucuronidase, beta-lactamase, chloramphenicol acetyltransferase (“CAT”), and alkaline phosphatase (“AP”). Alternatively, a reporter gene can also be a protein tag, such as, but not limited to, myc, His, FLAG, or GST, so that nonsense suppression will produce the peptide and the protein can be monitored by an ELISA, a western blot, or any other immunoassay to detect the protein tag. Such methods are well known to one of skill in the art. In a preferred embodiment, the reporter gene is easily assayed and has an activity which is not normally found in the gene of interest. Table 27 below lists various reporter genes and the properties of the products of the reporter genes that can be assayed. In a preferred embodiment, a reporter gene utilized in the reporter constructs is easily assayed and has an activity which is not normally found in the cell or organism of interest.

TABLE 27
Reporter Genes and the Properties of the Reporter Gene Products

Reporter Gene                              Protein Activity & Measurement
CAT (chloramphenicol acetyltransferase)    Transfers radioactive acetyl groups to chloramphenicol; detection by thin layer chromatography and autoradiography
GAL (beta-galactosidase)                   Hydrolyzes colorless galactosides to yield colored products
GUS (beta-glucuronidase)                   Hydrolyzes colorless glucuronides to yield colored products
LUC (luciferase)                           Oxidizes luciferin, emitting photons
GFP (green fluorescent protein)            Fluorescent protein without substrate
SEAP (secreted alkaline phosphatase)       Luminescence reaction with suitable substrates or with substrates that generate chromophores
HRP (horseradish peroxidase)               In the presence of hydrogen peroxide, oxidation of 3,3′,5,5′-tetramethylbenzidine to form a colored complex
AP (alkaline phosphatase)                  Luminescence reaction with suitable substrates or with substrates that generate chromophores

Described hereinbelow in further detail are specific reporter genes and characteristics of those reporter genes.

Luciferase

Luciferases are enzymes that emit light in the presence of oxygen and a substrate (luciferin) and which have been used for real-time, low-light imaging of gene expression in cell cultures, individual cells, whole organisms, and transgenic organisms (reviewed by Greer & Szalay, 2002, Luminescence 17(1):43-74).

As used herein, the term “luciferase” is intended to embrace all luciferases, or recombinant enzymes derived from luciferases which have luciferase activity. The luciferase genes from fireflies have been well characterized, for example, from the Photinus and Luciola species (see, e.g., International Patent Publication No. WO 95/25798 for Photinus pyralis, European Patent Application No. EP 0 524 448 for Luciola cruciata and Luciola lateralis, and Devine et al., 1993, Biochim. Biophys. Acta 1173(2):121-132 for Luciola mingrelica). Other eucaryotic luciferase genes include, but are not limited to, those of the click beetle (Pyrophorus plagiophthalamus, see, e.g., Wood et al., 1989, Science 244:700-702), the sea pansy (Renilla reniformis, see, e.g., Lorenz et al., 1991, Proc Natl Acad Sci USA 88(10):4438-4442), and the glow worm (Lampyris noctiluca, see, e.g., Sala-Newby et al., 1996, Biochem J. 313:761-767). The click beetle is unusual in that different members of the species emit bioluminescence of different colors, with emission at 546 nm (green), 560 nm (yellow-green), 578 nm (yellow) and 593 nm (orange) (see, e.g., U.S. Pat. Nos. 6,475,719; 6,342,379; and 6,217,847, the disclosures of which are incorporated by reference in their entireties). Bacterial luciferin-luciferase systems include, but are not limited to, the bacterial lux genes of terrestrial Photorhabdus luminescens (see, e.g., Manukhov et al., 2000, Genetika 36(3):322-30) and marine bacteria Vibrio fischeri and Vibrio harveyi (see, e.g., Miyamoto et al., 1988, J Biol Chem. 263(26):13393-9, and Cohn et al., 1983, Proc Natl Acad Sci USA., 80(1):120-3, respectively). The luciferases encompassed by the present invention also include the mutant luciferases described in U.S. Pat. No. 6,265,177 to Squirrell et al., which is hereby incorporated by reference in its entirety.

In a specific embodiment, the luciferase is a firefly luciferase, a renilla luciferase, or a click beetle luciferase, as described in any one of the references listed supra, the disclosures of which are incorporated by reference in their entireties.

Green Fluorescent Protein

Green fluorescent protein (“GFP”) is a 238 amino acid protein with amino acid residues 65 to 67 involved in the formation of the chromophore which does not require additional substrates or cofactors to fluoresce (see, e.g., Prasher et al., 1992, Gene 111:229-233; Yang et al., 1996, Nature Biotechnol. 14:1252-1256; and Cody et al., 1993, Biochemistry 32:1212-1218).

As used herein, the term “green fluorescent protein” or “GFP” is intended to embrace all GFPs (including the various forms of GFPs which exhibit colors other than green), or recombinant proteins derived from GFPs which have GFP activity. In a preferred embodiment, GFP includes green fluorescent protein, yellow fluorescent protein, red fluorescent protein, cyan fluorescent protein, and blue fluorescent protein. The native gene for GFP was cloned from the bioluminescent jellyfish Aequorea victoria (see, e.g., Morin et al., 1972, J. Cell Physiol. 77:313-318). Wild type GFP has a major excitation peak at 395 nm and a minor excitation peak at 470 nm. The absorption peak at 470 nm allows the monitoring of GFP levels using standard fluorescein isothiocyanate (FITC) filter sets. Mutants of the GFP gene have been found useful to enhance expression and to modify excitation and fluorescence. For example, mutant GFPs with alanine, glycine, isoleucine, or threonine substituted for serine at position 65 result in mutant GFPs with shifts in excitation maxima and greater fluorescence than wild type protein when excited at 488 nm (see, e.g., Heim et al., 1995, Nature 373:663-664; U.S. Pat. No. 5,625,048; Delagrave et al., 1995, Biotechnology 13:151-154; Cormack et al., 1996, Gene 173:33-38; and Cramer et al., 1996, Nature Biotechnol. 14:315-319). The ability to excite GFP at 488 nm permits the use of GFP with standard fluorescence activated cell sorting (“FACS”) equipment. In another embodiment, GFPs are isolated from organisms other than the jellyfish, such as, but not limited to, the sea pansy, Renilla reniformis.

Techniques for labeling cells with GFP in general are described in U.S. Pat. Nos. 5,491,084 and 5,804,387, which are incorporated by reference in their entireties; Chalfie et al., 1994, Science 263:802-805; Heim et al., 1994, Proc. Natl. Acad. Sci. USA 91:12501-12504; Morise et al., 1974, Biochemistry 13:2656-2662; Ward et al., 1980, Photochem. Photobiol. 31:611-615; Rizzuto et al., 1995, Curr. Biology 5:635-642; and Kaether & Gerdes, 1995, FEBS Lett 369:267-271. The expression of GFPs in E. coli and C. elegans is described in U.S. Pat. No. 6,251,384 to Tan et al., which is incorporated by reference in its entirety. The expression of GFP in plant cells is discussed in Hu & Cheng, 1995, FEBS Lett 369:331-33, and GFP expression in Drosophila is described in Davis et al., 1995, Dev. Biology 170:726-729.

Beta Galactosidase

Beta galactosidase (“beta-gal”) is an enzyme that catalyzes the hydrolysis of beta-galactosides, including lactose, and the galactoside analogs o-nitrophenyl-beta-D-galactopyranoside (“ONPG”) and chlorophenol red-beta-D-galactopyranoside (“CPRG”) (see, e.g., Nielsen et al., 1983, Proc Natl Acad Sci USA 80(17):5198-5202; Eustice et al., 1991, Biotechniques 11:739-742; and Henderson et al., 1986, Clin. Chem. 32:1637-1641). The beta-gal gene functions well as a reporter gene because the protein product is extremely stable, resistant to proteolytic degradation in cellular lysates, and easily assayed. When ONPG is used as the substrate, beta-gal activity can be quantitated with a spectrophotometer or microplate reader.

As used herein, the term “beta galactosidase” or “beta-gal” is intended to embrace all beta-gals, including lacZ gene products, or recombinant enzymes derived from beta-gals which have beta-gal activity. In an embodiment where ONPG is the substrate, beta-gal activity can be quantitated with a spectrophotometer or microplate reader to determine the amount of ONPG converted at 420 nm. In an embodiment where CPRG is the substrate, beta-gal activity can be quantitated with a spectrophotometer or microplate reader to determine the amount of CPRG converted at 570 to 595 nm. In yet another embodiment, the beta-gal activity can be visually ascertained by plating bacterial cells transformed with a beta-gal construct onto plates containing Xgal and IPTG. Bacterial colonies that are dark blue indicate the presence of high beta-gal activity and colonies that are varying shades of blue indicate varying levels of beta-gal activity.

Beta-Glucuronidase

Beta-glucuronidase (“GUS”) catalyzes the hydrolysis of a very wide variety of beta-glucuronides, and, with much lower efficiency, hydrolyzes some beta-galacturonides. GUS is very stable, will tolerate many detergents and widely varying ionic conditions, has no cofactors, nor any ionic requirements, can be assayed at any physiological pH, with an optimum between 5.0 and 7.8, and is reasonably resistant to thermal inactivation (see, e.g., U.S. Pat. No. 5,268,463, which is incorporated by reference in its entirety).

In one embodiment, the GUS is derived from the Escherichia coli beta-glucuronidase gene. In alternate embodiments of the invention, the beta-glucuronidase encoding nucleic acid is homologous to the E. coli beta-glucuronidase gene and/or may be derived from another organism or species.

GUS activity can be assayed either by fluorescence or spectrometry, or any other method described in U.S. Pat. No. 5,268,463, the disclosure of which is incorporated by reference in its entirety. For a fluorescent assay, 4-trifluoromethylumbelliferyl beta-D-glucuronide is a very sensitive substrate for GUS. The fluorescence maximum is close to 500 nm (bluish green), where very few plant compounds fluoresce or absorb. 4-trifluoromethylumbelliferyl beta-D-glucuronide also fluoresces much more strongly near neutral pH, allowing continuous assays to be performed more readily than with 4-methylumbelliferyl beta-D-glucuronide (“MUG”). 4-trifluoromethylumbelliferyl beta-D-glucuronide can be used as a fluorescent indicator in vivo. The spectrophotometric assay is very straightforward and moderately sensitive (Jefferson et al., 1986, Proc. Natl. Acad. Sci. USA 83:8447-8451). A preferred substrate for spectrophotometric measurement is p-nitrophenyl beta-D-glucuronide, which when cleaved by GUS releases the chromophore p-nitrophenol. At a pH greater than its pK_(a) (around 7.15) the ionized chromophore absorbs light at 400-420 nm, giving a yellow color.
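The pH dependence of the p-nitrophenol signal can be illustrated with the Henderson-Hasselbalch relationship. The following is a minimal sketch, assuming a single ionizable group with the pK_(a) of about 7.15 stated above; the function name is hypothetical and the calculation ignores temperature and ionic strength effects.

```python
def ionized_fraction(pH, pKa=7.15):
    """Fraction of p-nitrophenol present as the ionized (yellow) phenolate.

    Uses the Henderson-Hasselbalch relationship for a single acidic group:
    fraction = 1 / (1 + 10**(pKa - pH)). The default pKa of 7.15 is the
    approximate value given in the text.
    """
    return 1.0 / (1.0 + 10 ** (pKa - pH))


if __name__ == "__main__":
    for pH in (6.15, 7.15, 8.15, 9.15):
        print(f"pH {pH:.2f}: {ionized_fraction(pH):.3f} ionized")
```

Under these assumptions half of the released chromophore is ionized at a pH equal to the pK_(a), and about 91% is ionized one pH unit above it, which is one reason an endpoint reading is often taken after raising the pH of the reaction.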

Beta-Lactamases

Beta-lactamases are nearly optimal enzymes with respect to their almost diffusion-controlled catalysis of beta-lactam hydrolysis, making them suited to the task of an intracellular reporter enzyme (see, e.g., Christensen et al., 1990, Biochem. J. 266: 853-861). They cleave the beta-lactam ring of beta-lactam antibiotics, such as penicillins and cephalosporins, generating new charged moieties in the process (see, e.g., O'Callaghan et al., 1968, Antimicrob. Agents. Chemother. 8: 57-63 and Stratton, 1988, J. Antimicrob. Chemother. 22, Suppl. A: 23-35). A large number of beta-lactamases have been isolated and characterized, all of which would be suitable for use in accordance with the present invention (see, e.g., Richmond & Sykes, 1978, Adv. Microb. Physiol. 9:31-88 and Ambler, 1980, Phil. Trans. R. Soc. Lond. [Ser.B.] 289: 321-331, the disclosures of which are incorporated by reference in their entireties).

The coding region of an exemplary beta-lactamase employed has been described in U.S. Pat. No. 6,472,205, Kadonaga et al., 1984, J. Biol. Chem. 259: 2149-2154, and Sutcliffe, 1978, Proc. Natl. Acad. Sci. USA 75: 3737-3741, the disclosures of which are incorporated by reference in their entireties. As would be readily apparent to those skilled in the field, this and other comparable sequences for peptides having beta-lactamase activity would be equally suitable for use in accordance with the present invention. The combination of a fluorogenic substrate described in U.S. Pat. Nos. 6,472,205, 5,955,604, and 5,741,657, the disclosures of which are incorporated by reference in their entireties, and a suitable beta-lactamase can be employed in a wide variety of different assay systems, such as are described in U.S. Pat. No. 4,740,459, which is hereby incorporated by reference in its entirety.

Chloramphenicol Acetyltransferase

Chloramphenicol acetyltransferase (“CAT”) is commonly used as a reporter gene in mammalian cell systems because mammalian cells do not have detectable levels of CAT activity. The assay for CAT involves incubating cellular extracts with radiolabeled chloramphenicol and appropriate co-factors, separating the starting materials from the product by, for example, thin layer chromatography (“TLC”), followed by scintillation counting (see, e.g., U.S. Pat. No. 5,726,041, which is hereby incorporated by reference in its entirety).

As used herein, the term “chloramphenicol acetyltransferase” or “CAT” is intended to embrace all CATs, or recombinant enzymes derived from CAT which have CAT activity. While a reporter system which does not require cell processing, radioisotopes, and chromatographic separations would be more amenable to high through-put screening, CAT as a reporter gene may be preferable in situations when stability of the reporter gene is important. For example, the CAT reporter protein has an in vivo half life of about 50 hours, which is advantageous when an accumulative versus a dynamic change type of result is desired.
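To illustrate why a half life of about 50 hours favors an accumulative readout, the persistence of the reporter protein can be estimated as follows. This is a sketch only: it assumes simple first-order decay, which the text does not state, and the function name is hypothetical.

```python
def fraction_remaining(hours, half_life_hours=50.0):
    """Fraction of reporter protein remaining after `hours`.

    Assumes first-order (exponential) decay, an assumption made for this
    sketch. The default half life is the approximate in vivo value of
    50 hours given for CAT in the text.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return 0.5 ** (hours / half_life_hours)


if __name__ == "__main__":
    for t in (0, 24, 50, 100):
        print(f"{t:>3} h: {fraction_remaining(t):.2f} of the protein remains")
```

Under this assumption roughly 70% of the CAT protein made at a given time is still present 24 hours later, so the measured activity largely reflects expression accumulated over the course of the experiment rather than the expression level at the moment of sampling.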

Secreted Alkaline Phosphatase

The secreted alkaline phosphatase (“SEAP”) enzyme is a truncated form of alkaline phosphatase, in which the cleavage of the transmembrane domain of the protein allows it to be secreted from the cells into the surrounding media. In a preferred embodiment, the alkaline phosphatase is isolated from human placenta.

As used herein, the term “secreted alkaline phosphatase” or “SEAP” is intended to embrace all SEAP or recombinant enzymes derived from SEAP which have alkaline phosphatase activity. SEAP activity can be detected by a variety of methods including, but not limited to, measurement of catalysis of a fluorescent substrate, immunoprecipitation, HPLC, and radiometric detection. The luminescent method is preferred due to its increased sensitivity over colorimetric detection methods. An advantage of using SEAP is that a cell lysis step is not required since the SEAP protein is secreted out of the cell, which facilitates the automation of sampling and assay procedures. A cell-based assay using SEAP for assessment of inhibitors of the Hepatitis C virus protease is described in U.S. Pat. No. 6,280,940 to Potts et al., which is hereby incorporated by reference in its entirety.

5.9.1.1.1.2 Stop Codons

The present invention provides for methods for screening and identifying compounds that modulate premature translation termination and/or nonsense-mediated mRNA decay. A reporter gene may be engineered to contain a premature stop codon or may naturally contain a premature stop codon. Alternatively, a protein, polypeptide or peptide that regulates (directly or indirectly) the expression of a reporter gene may be engineered to contain or may naturally contain a premature stop codon. The premature stop codon may be any one of the stop codons known in the art including UAG, UAA and UGA.

In a specific embodiment, a reporter gene contains or is engineered to contain the premature stop codon UAG. In another embodiment, a reporter gene contains or is engineered to contain the premature stop codon UGA.

In a particular embodiment, a reporter gene contains or is engineered to contain two or more stop codons. In accordance with this embodiment, the stop codons are preferably at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides, at least 75 nucleotides or at least 100 nucleotides apart from each other. Further, in accordance with this embodiment, at least one of the stop codons is preferably UAG or UGA.

In a specific embodiment, a reporter gene contains or is engineered to contain a premature stop codon at least 15 nucleotides, preferably at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides or at least 75 nucleotides from the start codon in the coding sequence. In another embodiment, a reporter gene contains or is engineered to contain a premature stop codon at least 15 nucleotides, preferably at least 25 nucleotides, at least 50 nucleotides, at least 75 nucleotides, at least 100 nucleotides, at least 125 nucleotides, at least 150 nucleotides, at least 175 nucleotides or at least 200 nucleotides from the native stop codon in the coding sequence of the full-length reporter gene product or protein, polypeptide or peptide. In another embodiment, a reporter gene contains or is engineered to contain a premature stop codon at least 15 nucleotides (preferably at least 20 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 35 nucleotides, at least 40 nucleotides, at least 45 nucleotides, at least 50 nucleotides or at least 75 nucleotides) from the start codon in the coding sequence and at least 15 nucleotides (preferably at least 25 nucleotides, at least 50 nucleotides, at least 75 nucleotides, at least 100 nucleotides, at least 125 nucleotides, at least 150 nucleotides, at least 175 nucleotides or at least 200 nucleotides) from the native stop codon in the coding sequence of the full-length reporter gene product or protein, polypeptide or peptide. In accordance with these embodiments, the premature stop codon is preferably UAG or UGA.
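The positional criteria above can be checked on a candidate coding sequence in silico. The following is a minimal sketch with several assumptions that the text does not specify: the sequence begins at the first nucleotide of the start codon and ends with the native stop codon, distances are measured between the first nucleotides of the respective codons, and the function names are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def in_frame_stops(cds):
    """Return 0-based nucleotide offsets of every in-frame stop codon.

    `cds` is an RNA or DNA coding sequence starting at the start codon;
    T is treated as U.
    """
    rna = cds.upper().replace("T", "U")
    return [i for i in range(0, len(rna) - 2, 3) if rna[i:i + 3] in STOP_CODONS]


def premature_stop_ok(cds, min_from_start=15, min_from_native_stop=15):
    """True if every premature stop codon lies at least `min_from_start`
    nucleotides from the start codon and at least `min_from_native_stop`
    nucleotides from the native (final) stop codon.

    Returns False if the sequence contains no premature stop codon.
    """
    stops = in_frame_stops(cds)
    if not stops or stops[-1] != len(cds) - 3:
        raise ValueError("coding sequence must end with an in-frame stop codon")
    native = stops[-1]
    premature = stops[:-1]
    if not premature:
        return False
    return all(
        p >= min_from_start and native - p >= min_from_native_stop
        for p in premature
    )


if __name__ == "__main__":
    # Toy sequence: start codon, five Ala codons, UAG, five Ala codons, UAA.
    toy = "AUG" + "GCU" * 5 + "UAG" + "GCU" * 5 + "UAA"
    print(in_frame_stops(toy))
    print(premature_stop_ok(toy))
    print(premature_stop_ok(toy, min_from_start=20))
```

The default thresholds correspond to the 15-nucleotide minimum recited above; the preferred larger distances can be supplied through the two keyword arguments.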

The premature translation stop codon can be produced by in vitro mutagenesis techniques such as, but not limited to, polymerase chain reaction (“PCR”), linker insertion, oligonucleotide-mediated mutagenesis, and random chemical mutagenesis.
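When a premature stop codon is to be introduced by oligonucleotide-mediated mutagenesis, candidate positions can first be listed computationally. The sketch below is illustrative only: it assumes that a single-nucleotide substitution is desired and that UAG or UGA is the intended stop codon, and the function name and output format are hypothetical.

```python
TARGET_STOPS = ("UAG", "UGA")


def single_base_stop_sites(cds):
    """List codons that one nucleotide substitution converts to UAG or UGA.

    `cds` starts at the start codon and ends with the native stop codon;
    both of those codons are skipped. Returns (offset, codon, stop) tuples,
    where offset is the 0-based position of the codon's first nucleotide.
    """
    rna = cds.upper().replace("T", "U")
    sites = []
    for i in range(3, len(rna) - 5, 3):
        codon = rna[i:i + 3]
        for stop in TARGET_STOPS:
            mismatches = sum(1 for a, b in zip(codon, stop) if a != b)
            if mismatches == 1:
                sites.append((i, codon, stop))
    return sites


if __name__ == "__main__":
    # Toy sequence: AUG UGG CAG UAA
    for offset, codon, stop in single_base_stop_sites("AUGUGGCAGUAA"):
        print(f"offset {offset}: {codon} -> {stop}")
```

Each candidate returned in this way would still need to satisfy any positional criteria chosen for the reporter, such as the minimum distances from the start codon and the native stop codon.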

5.9.1.1.1.3 Vectors

The nucleotide sequence encoding a protein, polypeptide or peptide (e.g., a reporter gene), can be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence. The necessary transcriptional and translational elements can also be supplied by the protein, polypeptide or peptide. The regulatory regions and enhancer elements can be of a variety of origins, both natural and synthetic. In a specific embodiment, a reporter gene is operably linked to a regulatory element that is responsive to a regulatory protein whose expression is dependent upon the suppression of a premature stop codon.

A variety of host-vector systems may be utilized to express a protein, polypeptide or peptide. These include, but are not limited to, mammalian cell systems infected with virus (e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g., baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage DNA, plasmid DNA, or cosmid DNA; and stable cell lines generated by transformation using a selectable marker. The expression elements of vectors vary in their strengths and specificities. Depending on the host-vector system utilized, any one of a number of suitable transcription and translation elements may be used.

Any of the methods previously described for the insertion of DNA fragments into a vector may be used to construct expression vectors containing a chimeric nucleic acid consisting of appropriate transcriptional/translational control signals and the protein coding sequences. These methods may include in vitro recombinant DNA and synthetic techniques and in vivo recombinants (genetic recombination). Expression of a first nucleic acid sequence encoding a protein, polypeptide or peptide, such as a reporter gene, may be regulated by a second nucleic acid sequence so that the first nucleic acid sequence is expressed in a host transformed with the second nucleic acid sequence. For example, expression of a nucleic acid sequence encoding a protein, polypeptide or peptide, such as a reporter gene, may be controlled by any promoter/enhancer element known in the art, such as a constitutive promoter, a tissue-specific promoter, or an inducible promoter. Specific examples of promoters which may be used to control gene expression include, but are not limited to, the SV40 early promoter region (Benoist & Chambon, 1981, Nature 290:304-310), the promoter contained in the 3′ long terminal repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein gene (Brinster et al., 1982, Nature 296:39-42); prokaryotic expression vectors such as the β-lactamase promoter (Villa-Komaroff et al., 1978, Proc. Natl. Acad. Sci. U.S.A. 75:3727-3731), or the tac promoter (DeBoer et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:21-25); see also “Useful proteins from recombinant bacteria” in Scientific American, 1980, 242:74-94; plant expression vectors comprising the nopaline synthetase promoter region (Herrera-Estrella et al., Nature 303:209-213) or the cauliflower mosaic virus 35S RNA promoter (Gardner, et al., 1981, Nucl. Acids Res. 9:2871), and the promoter of the photosynthetic enzyme ribulose bisphosphate carboxylase (Herrera-Estrella et al., 1984, Nature 310:115-120); promoter elements from yeast or other fungi such as the Gal 4 promoter, the ADC (alcohol dehydrogenase) promoter, PGK (phosphoglycerol kinase) promoter, alkaline phosphatase promoter, and the following animal transcriptional control regions, which exhibit tissue specificity and have been utilized in transgenic animals: elastase I gene control region which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 1986, Cold Spring Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 7:425-515); insulin gene control region which is active in pancreatic beta cells (Hanahan, 1985, Nature 315:115-122), immunoglobulin gene control region which is active in lymphoid cells (Grosschedl et al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 318:533-538; Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse mammary tumor virus control region which is active in testicular, breast, lymphoid and mast cells (Leder et al., 1986, Cell 45:485-495), albumin gene control region which is active in liver (Pinkert et al., 1987, Genes and Devel. 1:268-276), alpha-fetoprotein gene control region which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer et al., 1987, Science 235:53-58); alpha 1-antitrypsin gene control region which is active in the liver (Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-globin gene control region which is active in myeloid cells (Magram et al., 1985, Nature 315:338-340; Kollias et al., 1986, Cell 46:89-94); myelin basic protein gene control region which is active in oligodendrocyte cells in the brain (Readhead et al., 1987, Cell 48:703-712); myosin light chain-2 gene control region which is active in skeletal muscle (Sani, 1985, Nature 314:283-286), and gonadotropic releasing hormone gene control region which is active in the hypothalamus (Mason et al., 1986, Science 234:1372-1378).

In a specific embodiment, a vector is used that comprises a promoter operably linked to a reporter gene, one or more origins of replication, and, optionally, one or more selectable markers (e.g., an antibiotic resistance gene). In a preferred embodiment, the vectors are CMV vectors, T7 vectors, lac vectors, pCEP4 vectors, 5.0/F vectors, or vectors with a tetracycline-regulated promoter (e.g., pcDNA™5/FRT/TO from Invitrogen). Some vectors may be obtained commercially. Non-limiting examples of useful vectors are described in Appendix 5 of Current Protocols in Molecular Biology, 1988, ed. Ausubel et al., Greene Publish. Assoc. & Wiley Interscience, which is incorporated herein by reference; and the catalogs of commercial suppliers such as Clontech Laboratories, Stratagene Inc., and Invitrogen, Inc.

Expression vectors containing a construct of the present invention can be identified by the following general approaches: (a) nucleic acid hybridization, (b) presence or absence of “marker” nucleic acid functions, (c) expression of inserted sequences, and (d) sequencing. In the first approach, the presence of a particular nucleic acid sequence inserted in an expression vector can be detected by nucleic acid hybridization using probes comprising sequences that are homologous to the inserted nucleic acid sequence. In the second approach, the recombinant vector/host system can be identified and selected based upon the presence or absence of certain “marker” nucleic acid functions (e.g., thymidine kinase activity, resistance to antibiotics, transformation phenotype, occlusion body formation in baculovirus, etc.) caused by the insertion of the nucleic acid sequence of interest in the vector. For example, if the nucleic acid sequence of interest is inserted within the marker nucleic acid sequence of the vector, recombinants containing the insert can be identified by the absence of the marker nucleic acid function. In the third approach, recombinant expression vectors can be identified by assaying the product expressed by the recombinant. Such assays can be based, for example, on the physical or functional properties of the particular nucleic acid sequence.

In a preferred embodiment, nucleic acid sequences encoding proteins, polypeptides or peptides are cloned into stable cell line expression vectors. In a preferred embodiment, the stable cell line expression vector contains a site specific genomic integration site. In another preferred embodiment, the reporter gene construct is cloned into an episomal mammalian expression vector.

5.9.1.1.1.4 Transfection

Once a vector encoding the appropriate gene has been synthesized, a host cell is transformed or transfected with the vector of interest. The use of stable transformants is preferred. In a preferred embodiment, the host cell is a mammalian cell. In a more preferred embodiment, the host cell is a human cell. In another embodiment, the host cells are primary cells isolated from a tissue or other biological sample of interest. Host cells that can be used in the methods of the present invention include, but are not limited to, hybridomas, pre-B cells, 293 cells, 293T cells, HeLa cells, HepG2 cells, K562 cells, and 3T3 cells. In another preferred embodiment, the host cells are derived from tissue specific to the nucleic acid sequence encoding a protein, polypeptide or peptide. In another preferred embodiment, the host cells are immortalized cell lines derived from a source, e.g., a tissue. Other host cells that can be used in the present invention include, but are not limited to, bacterial cells, yeast cells, virally-infected cells, or plant cells.

Preferred mammalian host cells include but are not limited to those derived from humans, monkeys and rodents (see, for example, Kriegler M. in “Gene Transfer and Expression: A Laboratory Manual”, New York, Freeman & Co. 1990), such as monkey kidney cell line transformed by SV40 (COS-7, ATCC Accession No. CRL 1651); human embryonic kidney cell lines (293, 293-EBNA, or 293 cells subcloned for growth in suspension culture, Graham et al., J. Gen. Virol., 36:59, 1977); baby hamster kidney cells (BHK, ATCC Accession No. CCL 10); Chinese hamster ovary-cells-DHFR (CHO, Urlaub and Chasin, Proc. Natl. Acad. Sci. 77:4216, 1980); mouse Sertoli cells (Mather, Biol. Reprod. 23:243-251, 1980); mouse fibroblast cells (NIH-3T3); monkey kidney cells (CV1, ATCC Accession No. CCL 70); African green monkey kidney cells (VERO76, ATCC Accession No. CRL-1587); human cervical carcinoma cells (HELA, ATCC Accession No. CCL 2); canine kidney cells (MDCK, ATCC Accession No. CCL 34); buffalo rat liver cells (BRL 3A, ATCC Accession No. CRL 1442); human lung cells (WI38, ATCC Accession No. CCL 75); human liver cells (Hep G2, HB 8065); and mouse mammary tumor cells (MMT 060562, ATCC Accession No. CCL51).

Other useful eukaryotic host-vector systems may include yeast and insect systems. In yeast, a number of vectors containing constitutive or inducible promoters may be used with Saccharomyces cerevisiae (baker's yeast), Schizosaccharomyces pombe (fission yeast), Pichia pastoris, and Hansenula polymorpha (methylotrophic yeasts). For a review see, Current Protocols in Molecular Biology, Vol. 2, 1988, Ed. Ausubel et al., Greene Publish. Assoc. & Wiley Interscience, Ch. 13; Grant et al., 1987, Expression and Secretion Vectors for Yeast, in Methods in Enzymology, Eds. Wu & Grossman, 1987, Acad. Press, N.Y., Vol. 153, pp. 516-544; Glover, 1986, DNA Cloning, Vol. II, IRL Press, Wash., D.C., Ch. 3; Bitter, 1987, Heterologous Gene Expression in Yeast, Methods in Enzymology, Eds. Berger & Kimmel, Acad. Press, N.Y., Vol. 152, pp. 673-684; and The Molecular Biology of the Yeast Saccharomyces, 1982, Eds. Strathern et al., Cold Spring Harbor Press, Vols. I and II.

Standard methods of introducing a nucleic acid sequence of interest into host cells can be used. Transformation may be by any known method for introducing polynucleotides into a host cell, including, for example, packaging the polynucleotide in a virus and transducing a host cell with the virus, and by direct uptake of the polynucleotide. The transformation procedure used depends upon the host to be transformed. Mammalian transformations (i.e., transfections) by direct uptake may be conducted using the calcium phosphate precipitation method of Graham & Van der Eb, 1978, Virol. 52:546, or the various known modifications thereof. Other methods for introducing recombinant polynucleotides into cells, particularly into mammalian cells, include dextran-mediated transfection, calcium phosphate mediated transfection, polybrene mediated transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the polynucleotides into nuclei. Such methods are well-known to one of skill in the art.

In a preferred embodiment, stable cell lines containing the constructs of interest are generated for high throughput screening. Such stable cell lines may be generated by introducing a construct comprising a selectable marker, allowing the cells to grow for 1-2 days in an enriched medium, and then growing the cells on a selective medium. The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci which in turn can be cloned and expanded into cell lines.

A number of selection systems may be used, including but not limited to the herpes simplex virus thymidine kinase (Wigler, et al., 1977, Cell 11:223), hypoxanthine-guanine phosphoribosyltransferase (Szybalska & Szybalski, 1962, Proc. Natl. Acad. Sci. USA 48:2026), and adenine phosphoribosyltransferase (Lowy, et al., 1980, Cell 22:817) genes, which can be employed in tk-, hgprt- or aprt- cells, respectively. Also, anti-metabolite resistance can be used as the basis of selection for the following genes: dhfr, which confers resistance to methotrexate (Wigler, et al., 1980, Proc. Natl. Acad. Sci. USA 77:3567; O'Hare, et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); gpt, which confers resistance to mycophenolic acid (Mulligan & Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, et al., 1981, J. Mol. Biol. 150:1); and hygro, which confers resistance to hygromycin (Santerre, et al., 1984, Gene 30:147).

5.9.1.1.1.5 Cell-Free Extracts

The invention provides for the translation of a nucleic acid sequence encoding a protein, polypeptide or peptide (with or without a premature translation stop codon) in a cell-free system. Techniques for practicing this specific aspect of the invention will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology, and recombinant DNA manipulation and production, which are routinely practiced by one of skill in the art. See, e.g., Sambrook, 1989, Molecular Cloning, A Laboratory Manual, Second Edition; DNA Cloning, Volumes I and II (Glover, Ed. 1985); and Transcription and Translation (Hames & Higgins, Eds. 1984).

Any technique well-known to one of skill in the art may be used to generate cell-free extracts for translation. For example, the cell-free extracts can be generated by centrifuging cells and clarifying the supernatant. In one embodiment, the cells are incubated on ice during the preparation of the cell-free extract. In another embodiment, the cells are incubated on ice for at least 12 hours, at least 24 hours, at least two days, at least five days, at least one week, or longer than one week. In a more specific embodiment, the cells are incubated on ice at least long enough so as to improve the translation activity of the cell extract in comparison to cell extracts that are not incubated on ice. In yet another embodiment, the cells are incubated at a temperature between about 0° C. and 10° C. In a preferred embodiment, the cells are incubated at about 4° C.

In another preferred embodiment, the cells are centrifuged at a low speed to isolate the cell-free extract for in vitro translation reactions. In a preferred embodiment, the cell extract is the supernatant from cells that are centrifuged at about 2×g to 20,000×g. In a more preferred embodiment, the cell extract is the supernatant from cells that are centrifuged at about 5×g to 15,000×g. In an even more preferred embodiment, the cell extract is the supernatant from cells that are centrifuged at about 10,000×g. Alternatively, in a preferred embodiment, the cell-free extract is about the S1 to S50 extract. In a more preferred embodiment, the cell extract is about the S5 to S25 extract. In an even more preferred embodiment, the cell extract is about the S10 extract.
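
The centrifugation conditions above are stated as relative centrifugal force (×g), whereas many centrifuges are set in revolutions per minute. The following is a minimal sketch of the conversion, assuming the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm² with the rotor radius r in centimeters. The function names and the 8 cm rotor radius are hypothetical and are not taken from this specification.

```python
import math

# Standard constant relating relative centrifugal force (x g) to rotor
# radius in centimeters and rotor speed in revolutions per minute.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Return the relative centrifugal force (x g) for a given rotor speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Return the rotor speed (rpm) needed to reach a given force (x g)."""
    if rcf < 0 or radius_cm <= 0:
        raise ValueError("rcf must be non-negative and radius_cm positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 8.0  # hypothetical rotor radius, for illustration only
    for rcf in (10_000, 15_000, 20_000):
        print(f"{rcf} x g at r = {radius_cm} cm: {rpm_from_rcf(rcf, radius_cm):.0f} rpm")
```

The required speed depends on the rotor radius, so the same ×g value corresponds to different rpm settings on different rotors.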

The cell-free translation extract may be isolated from cells of any species origin. In another embodiment, the cell-free translation extract is isolated from yeast, cultured mouse or rat cells, Chinese hamster ovary (CHO) cells, Xenopus oocytes, reticulocytes, wheat germ, or rye embryo (see, e.g., Krieg & Melton, 1984, Nature 308:203 and Dignam et al., 1990 Methods Enzymol. 182:194-203). Alternatively, the cell-free translation extract, e.g., rabbit reticulocyte lysates and wheat germ extract, can be purchased from, e.g., Promega (Madison, Wis.). In another embodiment, the cell-free translation extract is prepared as described in International Patent Publication No. WO 01/44516 and U.S. Pat. No. 4,668,625 to Roberts, the disclosures of which are incorporated by reference in their entireties. In a preferred embodiment, the cell-free extract is an extract isolated from human cells. In a more preferred embodiment, the human cells are HeLa cells. It is preferred that the endogenous expression of the genes with the premature translation stop codons is minimal, and preferably absent, in the cells from which the cell-free translation extract is prepared.

Systems for the in vitro transcription of RNAs with the gene of interest cloned in an expression vector using promoters such as, but not limited to, Sp6, T3, or T7 promoters (see, e.g., expression vectors from Invitrogen, Carlsbad, Calif.; Promega, Madison, Wis.; and Stratagene, La Jolla, Calif.), and the subsequent transcription of the gene with the appropriate polymerase are well-known to one of skill in the art (see, e.g., Contreras et al., 1982, Nucl. Acids. Res. 10:6353). In another embodiment, the gene encoding the premature stop codon can be PCR-amplified with the appropriate primers, with the sequence of a promoter, such as but not limited to, Sp6, T3, or T7 promoters, incorporated into the upstream primer, so that the resulting amplified PCR product can be in vitro transcribed with the appropriate polymerase.
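
The incorporation of a promoter sequence into the upstream PCR primer can be sketched as follows. This is an illustrative example only: it assumes the commonly used T7 promoter sequence 5'-TAATACGACTCACTATAGGG-3', and the gene-specific sequence and function name are hypothetical placeholders rather than primers disclosed in this specification.

```python
# Commonly used T7 RNA polymerase promoter sequence (5'->3'); transcription
# begins within the trailing G residues. Other promoters (e.g., Sp6 or T3)
# could be added to this mapping in the same way.
PROMOTERS = {"T7": "TAATACGACTCACTATAGGG"}

VALID_BASES = set("ACGT")


def build_upstream_primer(gene_specific: str, promoter: str = "T7") -> str:
    """Prepend a promoter sequence to a gene-specific forward primer so that
    the amplified PCR product can be transcribed in vitro."""
    gene_specific = gene_specific.upper()
    if not gene_specific or set(gene_specific) - VALID_BASES:
        raise ValueError("gene-specific sequence must contain only A, C, G, T")
    return PROMOTERS[promoter] + gene_specific


if __name__ == "__main__":
    # Hypothetical gene-specific 5' sequence beginning at the start codon.
    print(build_upstream_primer("ATGGAAGACGCCAAAAAC"))
```

The downstream primer needs no promoter sequence, since only the upstream end of the amplified product must be recognized by the polymerase.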

Alternatively, a coupled transcription-translation system can be used for the expression of a gene encoding a premature stop codon in a cell free extract, such as the TnT® Coupled Transcription/Translation System (Promega, Madison, Wis.) or the system described in U.S. Pat. No. 5,895,753 to Mierendorf et al., which is incorporated by reference in its entirety.

5.9.1.1.2 Assays

Various in vitro assays can be used to identify and verify the ability of a compound to modulate premature translation termination and/or nonsense-mediated mRNA decay. Multiple in vitro assays can be performed simultaneously or sequentially to assess the effect of a compound on premature translation termination and/or nonsense-mediated mRNA decay. In a preferred embodiment, the in vitro assays described herein are performed in a high throughput format (e.g., in microtiter plates).

In a specific embodiment, the invention provides a method for identifying a compound that modulates premature translation termination and/or nonsense-mediated mRNA decay, said method comprising: (a) contacting a member of a library of compounds with a cell containing a nucleic acid sequence comprising a reporter gene, wherein the reporter gene comprises a premature stop codon; and (b) detecting the expression of said reporter gene, wherein a compound that modulates premature translation termination and/or nonsense-mediated mRNA decay is identified if the expression of said reporter gene in the presence of a compound is altered relative to a previously determined reference range, or the expression of said reporter gene in the absence of said compound or the presence of an appropriate control (e.g., a negative control).

In another embodiment, the invention provides a method for identifying a compound that modulates premature translation termination and/or nonsense-mediated mRNA decay, said method comprising: (a) contacting a member of a library of compounds with a cell-free extract and a nucleic acid sequence comprising a reporter gene, wherein the reporter gene comprises a premature stop codon; and (b) detecting the expression of said reporter gene, wherein a compound that modulates premature translation termination and/or nonsense-mediated mRNA decay is identified if the expression of said reporter gene in the presence of a compound is altered relative to a previously determined reference range, or the expression of said reporter gene in the absence of said compound or the presence of an appropriate control (e.g., a negative control). In accordance with this embodiment, the cell-free extract is preferably isolated from cells that have been incubated at about 0° C. to about 10° C. and/or is an S10 to S30 cell-free extract.

The alteration in reporter gene expression and/or activity in the reporter gene assays relative to a previously determined reference range, or to the expression or activity of the reporter gene in the absence of the compound or the presence of an appropriate control (e.g., a negative control such as phosphate buffered saline) indicates that a particular compound modulates premature translation termination and/or nonsense-mediated mRNA decay. In particular, an increase in reporter gene expression or activity relative to a previously determined reference range, or to the expression in the absence of the compound or the presence of an appropriate control (e.g., a negative control) may, depending upon the parameters of the reporter gene assay, indicate that a particular compound reduces or suppresses premature translation termination and/or nonsense-mediated mRNA decay. In contrast, a decrease in reporter gene expression or activity relative to a previously determined reference range, or to the expression in the absence of the compound or the presence of an appropriate control (e.g., a negative control) may, depending upon the parameters of the reporter gene-based assay, indicate that a particular compound enhances premature translation termination and/or nonsense-mediated mRNA decay.
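
The comparison of a reporter signal against a previously determined reference range can be sketched as follows. This is a simplified illustration under stated assumptions: the class and function names, the numeric range, and the output labels are hypothetical, and the interpretation of an increase or decrease depends upon the parameters of the particular reporter gene assay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceRange:
    """Previously determined range of reporter signal without compound."""
    low: float
    high: float


def classify_compound(signal: float, reference: ReferenceRange) -> str:
    """Label a compound by comparing its reporter signal to the reference range.

    An increase may indicate suppression of premature translation termination
    and/or nonsense-mediated mRNA decay; a decrease may indicate enhancement.
    """
    if signal > reference.high:
        return "candidate suppressor"
    if signal < reference.low:
        return "candidate enhancer"
    return "no alteration detected"


if __name__ == "__main__":
    reference = ReferenceRange(low=80.0, high=120.0)  # hypothetical units
    for signal in (150.0, 100.0, 40.0):
        print(signal, classify_compound(signal, reference))
```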

The step of contacting a compound or a member of a library of compounds with a cell in the reporter gene-based assays described herein is preferably conducted under physiologic conditions. In a specific embodiment, a compound or a member of a library of compounds is added to the cells in the presence of an aqueous solution. In accordance with this embodiment, the aqueous solution may comprise a buffer and a combination of salts, preferably approximating or mimicking physiologic conditions. Alternatively, the aqueous solution may comprise a buffer, a combination of salts, and a detergent or a surfactant. Examples of salts which may be used in the aqueous solution include, but are not limited to, KCl, NaCl, and/or MgCl₂. The optimal concentration of each salt used in the aqueous solution is dependent on the cells and compounds used and can be determined using routine experimentation. The step of contacting a compound or a member of a library of compounds with a cell containing a reporter gene construct and, in some circumstances, a nucleic acid sequence encoding a regulatory protein, may be performed for at least 0.2 hours, 0.25 hours, 0.5 hours, 1 hour, 2 hours, 3 hours, 4 hours, 5 hours, 6 hours, 8 hours, 10 hours, 12 hours, 18 hours, at least 1 day, at least 2 days or at least 3 days.

The expression of a reporter gene and/or activity of the protein encoded by the reporter gene in the reporter-gene assays may be detected by any technique well-known to one of skill in the art. The expression of a reporter gene can be readily detected, e.g., by quantifying the protein and/or RNA encoded by said gene. Compounds that modulate premature translation termination and/or nonsense-mediated mRNA decay may be identified by changes in the gene encoding the premature translation stop codon, i.e., there is readthrough of the premature translation stop codon and a longer gene product is detected. If a gene encoding a naturally-occurring premature translation stop codon is used, a longer gene product in the presence of a compound that modulates premature translation termination and/or nonsense-mediated mRNA decay can be detected by any method in the art that permits the detection of the longer polypeptide, such as, but not limited to, immunological methods.
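
The relationship between readthrough of a premature stop codon and a longer gene product can be illustrated with a short translation sketch. It assumes the standard genetic code; the example sequence and function name are hypothetical, and because the residue inserted at a read-through stop codon is not specified here, it is shown as the placeholder "X".

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(orf: str, readthrough_stops: int = 0) -> str:
    """Translate an open reading frame until the first stop codon that is not
    read through. Each read-through stop is written as the placeholder 'X'."""
    peptide = []
    usable_length = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    for pos in range(0, usable_length, 3):
        residue = CODON_TABLE[orf[pos:pos + 3].upper()]
        if residue == "*":
            if readthrough_stops == 0:
                break
            readthrough_stops -= 1
            residue = "X"
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    # Hypothetical reading frame: ATG GCT TGA GGT AAA TAA, where TGA is a
    # premature stop codon and TAA is the normal stop codon.
    orf = "ATGGCTTGAGGTAAATAA"
    print(translate(orf))                       # truncated product
    print(translate(orf, readthrough_stops=1))  # longer product
```

An antibody specific to the C-terminal portion of the polypeptide would recognize only the longer product, since the truncated product lacks the residues downstream of the premature stop codon.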

Many methods standard in the art can be thus employed, including, but not limited to, immunoassays to detect and/or visualize gene expression (e.g., Western blot, immunoprecipitation followed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), immunocytochemistry, radioimmunoassays, ELISA (enzyme linked immunosorbent assay), “sandwich” immunoassays, immunoprecipitation assays, precipitin reactions, gel diffusion precipitin reactions, immunodiffusion assays, agglutination assays, complement-fixation assays, immunoradiometric assays, fluorescent immunoassays, protein A immunoassays, or an epitope tag using an antibody that is specific to the polypeptide encoded by the gene of interest) and/or hybridization assays to detect gene expression by detecting and/or visualizing respectively mRNA encoding a gene (e.g., Northern assays, dot blots, in situ hybridization, etc.), etc. Preferably, the antibody is specific to the C-terminal portion of the polypeptide used in an immunoassay. Such assays are routine and well known in the art (see, e.g., Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York, which is incorporated by reference herein in its entirety). Exemplary immunoassays are described briefly below (but are not intended by way of limitation).

Immunoprecipitation protocols generally comprise lysing a population of cells in a lysis buffer such as RIPA buffer (1% NP-40 or Triton X-100, 1% sodium deoxycholate, 0.1% SDS, 0.15 M NaCl, 0.01 M sodium phosphate at pH 7.2, 1% Trasylol) supplemented with protein phosphatase and/or protease inhibitors (e.g., EDTA, PMSF, aprotinin, sodium vanadate), adding the antibody which recognizes the antigen to the cell lysate, incubating for a period of time (e.g., 1 to 4 hours) at 4° C., adding protein A and/or protein G sepharose beads to the cell lysate, incubating for about an hour or more at 4° C., washing the beads in lysis buffer and resuspending the beads in SDS/sample buffer. The ability of the antibody to immunoprecipitate a particular antigen can be assessed by, e.g., western blot analysis. One of skill in the art would be knowledgeable as to the parameters that can be modified to increase the binding of the antibody to an antigen and decrease the background (e.g., pre-clearing the cell lysate with sepharose beads). For further discussion regarding immunoprecipitation protocols see, e.g., Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York at 10.16.1.

Western blot analysis generally comprises preparing protein samples, electrophoresis of the protein samples in a polyacrylamide gel (e.g., 8%-20% SDS-PAGE depending on the molecular weight of the antigen), transferring the protein sample from the polyacrylamide gel to a membrane such as nitrocellulose, PVDF or nylon, blocking the membrane in blocking solution (e.g., PBS with 3% BSA or non-fat milk), washing the membrane in washing buffer (e.g., PBS-Tween 20), incubating the membrane with primary antibody (the antibody which recognizes the antigen) diluted in blocking buffer, washing the membrane in washing buffer, incubating the membrane with a secondary antibody (which recognizes the primary antibody, e.g., an anti-human antibody) conjugated to an enzyme (e.g., horseradish peroxidase or alkaline phosphatase) or radioactive molecule (e.g., ³²P or ¹²⁵I) diluted in blocking buffer, washing the membrane in wash buffer, and detecting the presence of the antigen. One of skill in the art would be knowledgeable as to the parameters that can be modified to increase the signal detected and to reduce the background noise. For further discussion regarding western blot protocols see, e.g., Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York at 10.8.1.

ELISAs comprise preparing antigen, coating the well of a 96 well microtiter plate with the antigen, adding a primary antibody (which recognizes the antigen) conjugated to a detectable compound such as an enzyme (e.g., horseradish peroxidase or alkaline phosphatase) to the well and incubating for a period of time, and detecting the presence of the antigen. In ELISAs the antibody of interest does not have to be conjugated to a detectable compound; instead, a second antibody (which recognizes the primary antibody) conjugated to a detectable compound may be added to the well. Further, instead of coating the well with the antigen, the antibody may be coated to the well. In this case, a second antibody conjugated to a detectable compound may be added following the addition of the antigen of interest to the coated well. One of skill in the art would be knowledgeable as to the parameters that can be modified to increase the signal detected as well as other variations of ELISAs known in the art. For further discussion regarding ELISAs see, e.g., Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York at 11.2.1.

Methods for detecting the activity of a protein encoded by a reporter gene will vary with the reporter gene used. Assays for the various reporter genes are well-known to one of skill in the art. For example, as described in Section 5.1.1., luciferase, beta-galactosidase (“beta-gal”), beta-glucuronidase (“GUS”), beta-lactamase, chloramphenicol acetyltransferase (“CAT”), and alkaline phosphatase (“AP”) are enzymes that can be analyzed in the presence of a substrate and could be amenable to high throughput screening. For example, the reaction products of luciferase, beta-galactosidase (“beta-gal”), and alkaline phosphatase (“AP”) are assayed by changes in light imaging (e.g., luciferase), spectrophotometric absorbance (e.g., beta-gal), or fluorescence (e.g., AP). Assays for changes in light output, absorbance, and/or fluorescence are easily adapted for high throughput screening. For example, beta-gal activity can be measured with a microplate reader. Green fluorescent protein (“GFP”) activity can be measured by changes in fluorescence. For example, in the case of mutant GFPs that fluoresce at 488 nm, standard fluorescence activated cell sorting (“FACS”) equipment can be used to separate cells based upon GFP activity.

Changes in mRNA stability of the gene encoding the premature translation stop codon can be measured. As discussed above, nonsense-mediated mRNA decay alters the stability of an mRNA with a premature translation stop codon so that such mRNA is targeted for rapid decay instead of translation. In the presence of a compound that modulates premature translation termination and/or nonsense-mediated mRNA decay, the stability of the mRNA with the premature translation stop codon is likely altered, i.e., stabilized. Methods of measuring changes in steady state levels of mRNA are well-known to one of skill in the art. Such methods include, but are not limited to, Northern blots, dot blots, solution hybridization, RNase protection assays, and S1 nuclease protection assays, wherein the steady state levels of the mRNA of interest are measured with an appropriately labeled nucleic acid probe. Alternatively, methods such as semi-quantitative polymerase chain reaction (“PCR”) can be used to measure changes in steady state levels of the mRNA of interest using the appropriate primers for amplification.

Alterations in the expression of a reporter gene may be determined by comparing the level of expression and/or activity of the reporter gene to a negative control (e.g., PBS or another agent that is known to have no effect on the expression of the reporter gene) and optionally, a positive control (e.g., an agent that is known to have an effect on the expression of the reporter gene, preferably an agent that affects premature translation termination and/or nonsense-mediated mRNA decay). Alternatively, alterations in the expression and/or activity of a reporter gene may be determined by comparing the level of expression and/or activity of the reporter gene to a previously determined reference range.
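
One way to express the comparison against negative and positive controls is to place each test signal on a scale from the negative control (0%) to the positive control (100%). The following is a minimal sketch of that normalization. The normalization itself, the function name, and the well readings are assumptions made for illustration and are not measurements or a procedure taken from this specification.

```python
from statistics import mean
from typing import Sequence


def percent_of_positive_control(
    sample: float,
    negative_wells: Sequence[float],
    positive_wells: Sequence[float],
) -> float:
    """Scale a reporter signal so that the mean negative control maps to 0
    and the mean positive control maps to 100."""
    negative = mean(negative_wells)
    positive = mean(positive_wells)
    if positive == negative:
        raise ValueError("controls are indistinguishable; cannot normalize")
    return 100.0 * (sample - negative) / (positive - negative)


if __name__ == "__main__":
    negative_wells = [9.0, 10.0, 11.0]    # hypothetical readings, e.g., PBS
    positive_wells = [95.0, 100.0, 105.0]  # hypothetical known-active agent
    print(percent_of_positive_control(55.0, negative_wells, positive_wells))
```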

5.9.1.2 Other In Vitro Assays

Where the gene product of interest is involved in cell growth or viability, the in vivo effect of the lead compound can be assayed by measuring the cell growth or viability of the target cell. Such assays can be carried out with representative cells of cell types involved in a particular disease or disorder (e.g., leukocytes such as T cells, B cells, natural killer cells, macrophages, neutrophils and eosinophils). A lower level of proliferation or survival of the contacted cells indicates that the lead compound is effective to treat a condition in the patient characterized by uncontrolled cell growth. Alternatively, instead of culturing cells from a patient, a lead compound may be screened using cells of a tumor or malignant cell line or an endothelial cell line. Specific examples of cell culture models include, but are not limited to, for lung cancer, primary rat lung tumor cells (see, e.g., Swafford et al., 1997, Mol. Cell. Biol., 17:1366-1374) and large-cell undifferentiated cancer cell lines (see, e.g., Mabry et al., 1991, Cancer Cells, 3:53-58); colorectal cell lines for colon cancer (see, e.g., Park & Gazdar, 1996, J. Cell Biochem. Suppl. 24:131-141); multiple established cell lines for breast cancer (see, e.g., Hambly et al., 1997, Breast Cancer Res. Treat. 43:247-258; Gierthy et al., 1997, Chemosphere 34:1495-1505; and Prasad & Church, 1997, Biochem. Biophys. Res. Commun. 232:14-19); a number of well-characterized cell models for prostate cancer (see, e.g., Webber et al., 1996, Prostate, Part 1, 29:386-394; Part 2, 30:58-64; and Part 3, 30:136-142 and Boulikas, 1997, Anticancer Res. 17:1471-1505); for genitourinary cancers, continuous human bladder cancer cell lines (see, e.g., Ribeiro et al., 1997, Int. J. Radiat. Biol. 72:11-20); organ cultures of transitional cell carcinomas (see, e.g., Booth et al., 1997, Lab Invest. 76:843-857) and rat progression models (see, e.g., Vet et al., 1997, Biochim. Biophys. Acta 1360:39-44); and established cell lines for leukemias and lymphomas (see, e.g., Drexler, 1994, Leuk. Res. 18:919-927 and Tohyama, 1997, Int. J. Hematol. 65:309-317).

Many assays well-known in the art can be used to assess the survival and/or growth of a patient cell or cell line following exposure to a lead compound; for example, cell proliferation can be assayed by measuring bromodeoxyuridine (BrdU) incorporation (see, e.g., Hoshino et al., 1986, Int. J. Cancer 38:369 and Campana et al., 1988, J. Immunol. Meth. 107:79) or (3H)-thymidine incorporation (see, e.g., Chen, 1996, Oncogene 13:1395-403 and Jeoung, 1995, J. Biol. Chem. 270:18367-73), by direct cell count, or by detecting changes in transcription, translation or activity of known genes such as proto-oncogenes (e.g., fos, myc) or cell cycle markers (Rb, cdc2, cyclin A, D1, D2, D3, E, etc.). The levels of such protein and mRNA and activity can be determined by any method well known in the art. For example, protein can be quantitated by known immunodiagnostic methods such as western blotting or immunoprecipitation using commercially available antibodies. mRNA can be quantitated using methods that are well known and routine in the art, for example, using northern analysis, RNase protection, or the polymerase chain reaction in connection with reverse transcription (“RT-PCR”). Cell viability can be assessed by using trypan-blue staining or other cell death or viability markers known in the art. In a specific embodiment, the level of cellular ATP is measured to determine cell viability. Differentiation can be assessed, for example, visually based on changes in morphology.

The lead compound can also be assessed for its ability to inhibit cell transformation (or progression to malignant phenotype) in vitro. In this embodiment, cells with a transformed cell phenotype are contacted with a lead compound, and examined for change in characteristics associated with a transformed phenotype (a set of in vitro characteristics associated with a tumorigenic ability in vivo), for example, but not limited to, colony formation in soft agar, a more rounded cell morphology, looser substratum attachment, loss of contact inhibition, loss of anchorage dependence, release of proteases such as plasminogen activator, increased sugar transport, decreased serum requirement, or expression of fetal antigens, etc. (see, e.g., Luria et al., 1978, General Virology, 3d Ed., John Wiley & Sons, New York, pp. 436-446).

Loss of invasiveness or decreased adhesion can also be assessed to demonstrate the anti-cancer effects of a lead compound. For example, an aspect of the formation of a metastatic cancer is the ability of a precancerous or cancerous cell to detach from the primary site of disease and establish a novel colony of growth at a secondary site. The ability of a cell to invade peripheral sites reflects its potential for a cancerous state. Loss of invasiveness can be measured by a variety of techniques known in the art including, for example, induction of E-cadherin-mediated cell-cell adhesion. Such E-cadherin-mediated adhesion can result in phenotypic reversion and loss of invasiveness (see, e.g., Hordijk et al., 1997, Science 278:1464-66).

Loss of invasiveness can further be examined by inhibition of cell migration. A variety of 2-dimensional and 3-dimensional cellular matrices are commercially available (Calbiochem-Novabiochem Corp. San Diego, Calif.). Cell migration across or into a matrix can be examined using microscopy, time-lapsed photography or videography, or by any method in the art allowing measurement of cellular migration. In a related embodiment, loss of invasiveness is examined by response to hepatocyte growth factor (“HGF”). HGF-induced cell scattering is correlated with invasiveness of cells such as Madin-Darby canine kidney (“MDCK”) cells. This assay identifies a cell population that has lost cell scattering activity in response to HGF (see, e.g., Hordijk et al., 1997, Science 278:1464-66).

Alternatively, loss of invasiveness can be measured by cell migration through a chemotaxis chamber (Neuroprobe/Precision Biochemicals Inc. Vancouver, BC). In such an assay, a chemo-attractant agent is incubated on one side of the chamber (e.g., the bottom chamber) and cells are plated on a filter separating the opposite side (e.g., the top chamber). In order for cells to pass from the top chamber to the bottom chamber, the cells must actively migrate through small pores in the filter. Checkerboard analysis of the number of cells that have migrated can then be correlated with invasiveness (see e.g., Ohnishi, 1993, Biochem. Biophys. Res. Commun. 193:518-25).

A lead compound can also be assessed for its ability to alter the expression of a secondary protein (as determined, e.g., by western blot analysis) or RNA, whose expression and/or activation is regulated directly or indirectly by the gene product of a gene of interest containing a premature stop codon or a nonsense mutation (as determined, e.g., by RT-PCR or northern blot analysis) in cultured cells in vitro using methods which are well known in the art. Further, chemical footprinting analysis can be conducted and is well-known in the art.

In another embodiment of the invention, the lead compound can be tested in a host cell. In such an embodiment, the host cell can encode a nucleic acid with a premature translation termination sequence or stop codon. Such nucleic acids can be encoded by a number of vehicles, including, but not limited to, recombinant or chimeric vectors, viruses or the genome of the host cell. In another embodiment of the invention, the presence of the gene, containing a premature stop codon or translation termination sequence, causes a detectable phenotype in the host cell. Moreover, the effect of lead compounds on the phenotype of the cell can be examined in order to determine suitable candidates that modulate premature translation termination from a pool of compounds. In one embodiment, a host cell containing a gene encoding a premature translation termination sequence or stop codon exhibits an abnormal phenotype relative to the wild type cell that does not encode a gene with a premature stop codon. In such an embodiment, the effect of a compound on the host cell phenotype can be examined in order to determine the effect of a lead compound on premature translation termination or nonsense-mediated mRNA decay. By way of example and not meant to limit the possible models, host cells expressing mutations in a gene that controls cell cycle or proliferation, e.g., p53, can be exposed to various lead compounds in order to determine their effect on cell proliferation. Any lead compound that affects the proliferative activity of the host cell is identified as a compound that modulates premature translation termination or nonsense-mediated mRNA decay.

5.9.2 Animal Models

Animal model systems can be used to demonstrate the safety and efficacy of the lead compounds identified in the nonsense suppression assays described above. The lead compounds identified in the nonsense suppression assay can then be tested for biological activity using animal models for a disease, condition, or syndrome of interest. These include animals engineered to contain the target RNA element coupled to a functional readout system, such as a transgenic mouse.

There are a number of methods that can be used to conduct animal model studies. Briefly, a compound identified in accordance with the methods of the invention is introduced into an animal model so that the effect of the compound on the manifestation of disease can be determined. The prevention or reduction in the severity, duration or onset of a symptom associated with the disease or disorder of the animal model that is associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay would indicate that the compound administered to the animal model had a prophylactic or therapeutic effect. Any method can be used to introduce the compound into the animal model, including, but not limited to, injection, intravenous infusion, oral ingestion, or inhalation. In a preferred embodiment, transgenic hosts are constructed so that the animal genome encodes a gene of interest with a premature translation termination sequence or stop codon. In such an embodiment, the gene containing a premature translation termination sequence or stop codon would not encode a full length peptide from a transcribed mRNA. The administration of a compound to the animal model, and the expression of a full length protein, polypeptide or peptide, for example, corresponding to the gene containing a premature stop codon would indicate that the compound modulates premature translation termination. Any method known in the art, or described herein, can be used to determine if the stop codon was modulated by the compound. In another embodiment, the animal host genome encodes a native gene containing a premature stop codon. In another embodiment of the invention, the animal host is a natural mutant, i.e., natively encoding a gene with a premature stop codon. For example, the animal can be a model for cystic fibrosis wherein the animal genome contains a natural mutation that incorporates a premature stop codon or translation termination sequence.

Examples of animal models for cystic fibrosis include, but are not limited to, cftr(−/−) mice (see, e.g., Freedman et al., 2001, Gastroenterology 121(4):950-7), cftr(tm1HGU/tm1HGU) mice (see, e.g., Bernhard et al., 2001, Exp Lung Res 27(4):349-66), CFTR-deficient mice with defective cAMP-mediated Cl(−) conductance (see, e.g., Stotland et al., 2000, Pediatr Pulmonol 30(5):413-24), and C57BL/6-Cftr(m1UNC)/Cftr(m1UNC) knockout mice (see, e.g., Stotland et al., 2000, Pediatr Pulmonol 30(5):413-24).

Examples of animal models for muscular dystrophy include, but are not limited to, mouse, hamster, cat, dog, and C. elegans. Examples of mouse models for muscular dystrophy include, but are not limited to, the dy−/− mouse (see, e.g., Connolly et al., 2002, J Neuroimmunol 127(1-2):80-7), a muscular dystrophy with myositis (mdm) mouse mutation (see, e.g., Garvey et al., 2002, Genomics 79(2):146-9), the mdx mouse (see, e.g., Nakamura et al., 2001, Neuromuscul Disord 11(3):251-9), the utrophin-dystrophin knockout (dko) mouse (see, e.g., Nakamura et al., 2001, Neuromuscul Disord 11(3):251-9), the dy/dy mouse (see, e.g., Dubowitz et al., 2000, Neuromuscul Disord 10(4-5):292-8), the mdx(Cv3) mouse model (see, e.g., Pillers et al., 1999, Laryngoscope 109(8):1310-2), and the myotonic ADR-MDX mutant mice (see, e.g., Kramer et al., 1998, Neuromuscul Disord 8(8):542-50). Examples of hamster models for muscular dystrophy include, but are not limited to, sarcoglycan-deficient hamsters (see, e.g., Nakamura et al., 2001, Am J Physiol Cell Physiol 281(2):C690-9) and the BIO 14.6 dystrophic hamster (see, e.g., Schlenker & Burbach, 1991, J Appl Physiol 71(5):1655-62). An example of a feline model for muscular dystrophy includes, but is not limited to, the hypertrophic feline muscular dystrophy model (see, e.g., Gaschen & Burgunder, 2001, Acta Neuropathol (Berl) 101(6):591-600). Canine models for muscular dystrophy include, but are not limited to, golden retriever muscular dystrophy (see, e.g., Fletcher et al., 2001, Neuromuscul Disord 11(3):239-43) and canine X-linked muscular dystrophy (see, e.g., Valentine et al., 1992, Am J Med Genet 42(3):352-6). Examples of C. elegans models for muscular dystrophy are described in Chamberlain & Benian, 2000, Curr Biol 10(21):R795-7 and Culette & Sattelle, 2000, Hum Mol Genet 9(6):869-77.

Examples of animal models for familial hypercholesterolemia include, but are not limited to, mice lacking functional LDL receptor genes (see, e.g., Aji et al., 1997, Circulation 95(2):430-7), Yoshida rats (see, e.g., Fantappie et al., 1992, Life Sci 50(24):1913-24), the JCR:LA-cp rat (see, e.g., Richardson et al., 1998, Atherosclerosis 138(1):135-46), swine (see, e.g., Hasler-Rapacz et al., 1998, Am J Med Genet 76(5):379-86), and the Watanabe heritable hyperlipidaemic rabbit (see, e.g., Tsutsumi et al., 2000, Arzneimittelforschung 50(2):118-21; Harsch et al., 1998, Br J Pharmacol 124(2):227-82; and Tanaka et al., 1995, Atherosclerosis 114(1):73-82).

An example of an animal model for human cancer in general includes, but is not limited to, spontaneously occurring tumors of companion animals (see, e.g., Vail & MacEwen, 2000, Cancer Invest 18(8):781-92). Examples of animal models for lung cancer include, but are not limited to, lung cancer animal models described by Zhang & Roth (1994, In Vivo 8(5):755-69) and a transgenic mouse model with disrupted p53 function (see, e.g., Morris et al., 1998, J La State Med Soc 150(4):179-85). An example of an animal model for breast cancer includes, but is not limited to, a transgenic mouse that overexpresses cyclin D1 (see, e.g., Hosokawa et al., 2001, Transgenic Res 10(5):471-8). An example of an animal model for colon cancer includes, but is not limited to, a TCRbeta and p53 double knockout mouse (see, e.g., Kado et al., 2001, Cancer Res 61(6):2395-8). Examples of animal models for pancreatic cancer include, but are not limited to, a metastatic model of Panc02 murine pancreatic adenocarcinoma (see, e.g., Wang et al., 2001, Int J Pancreatol 29(1):37-46) and nu-nu mice generated in subcutaneous pancreatic tumours (see, e.g., Ghaneh et al., 2001, Gene Ther 8(3):199-208). Examples of animal models for non-Hodgkin's lymphoma include, but are not limited to, a severe combined immunodeficiency (“SCID”) mouse (see, e.g., Bryant et al., 2000, Lab Invest 80(4):553-73) and an IgHmu-HOX11 transgenic mouse (see, e.g., Hough et al., 1998, Proc Natl Acad Sci USA 95(23):13853-8). An example of an animal model for esophageal cancer includes, but is not limited to, a mouse transgenic for the human papillomavirus type 16 E7 oncogene (see, e.g., Herber et al., 1996, J Virol 70(3):1873-81). Examples of animal models for colorectal carcinomas include, but are not limited to, Apc mouse models (see, e.g., Fodde & Smits, 2001, Trends Mol Med 7(8):369-73 and Kuraguchi et al., 2000, Oncogene 19(50):5755-63). An example of an animal model for neurofibromatosis includes, but is not limited to, mutant NF1 mice (see, e.g., Cichowski et al., 1996, Semin Cancer Biol 7(5):291-8). Examples of animal models for retinoblastoma include, but are not limited to, transgenic mice that express the simian virus 40 T antigen in the retina (see, e.g., Howes et al., 1994, Invest Ophthalmol Vis Sci 35(2):342-51 and Windle et al., 1990, Nature 343(6259):665-9) and inbred rats (see, e.g., Nishida et al., 1981, Curr Eye Res 1(1):53-5 and Kobayashi et al., 1982, Acta Neuropathol (Berl) 57(2-3):203-8). Examples of animal models for Wilms' tumor include, but are not limited to, WT1 knockout mice (see, e.g., Scharnhorst et al., 1997, Cell Growth Differ 8(2):133-43), a rat subline with a high incidence of nephroblastoma (see, e.g., Mesfin & Breech, 1996, Lab Anim Sci 46(3):321-6), and a Wistar/Furth rat with Wilms' tumor (see, e.g., Murphy et al., 1987, Anticancer Res 7(4B):717-9).

Examples of animal models for retinitis pigmentosa include, but are not limited to, the Royal College of Surgeons (“RCS”) rat (see, e.g., Vollrath et al., 2001, Proc Natl Acad Sci USA 98(22):12584-9 and Hanitzsch et al., 1998, Acta Anat (Basel) 162(2-3):119-26), a rhodopsin knockout mouse (see, e.g., Jaissle et al., 2001, Invest Ophthalmol Vis Sci 42(2):506-13), and Wag/Rij rats (see, e.g., Lai et al., 1980, Am J Pathol 98(1):281-4).

Examples of animal models for cirrhosis include, but are not limited to, CCl₄-exposed rats (see, e.g., Kloehn et al., 2001, Horm Metab Res 33(7):394-401) and rodent models instigated by bacterial cell components or colitis (see, e.g., Vierling, 2001, Best Pract Res Clin Gastroenterol 15(4):591-610).

Examples of animal models for hemophilia include, but are not limited to, rodent models for hemophilia A (see, e.g., Reipert et al., 2000, Thromb Haemost 84(5):826-32; Jarvis et al., 1996, Thromb Haemost 75(2):318-25; and Bi et al., 1995, Nat Genet 10(1):119-21), canine models for hemophilia A (see, e.g., Gallo-Penn et al., 1999, Hum Gene Ther 10(11):1791-802 and Connelly et al., 1998, Blood 91(9):3273-81), murine models for hemophilia B (see, e.g., Snyder et al., 1999, Nat Med 5(1):64-70; Wang et al., 1997, Proc Natl Acad Sci USA 94(21):11563-6; and Fang et al., 1996, Gene Ther 3(3):217-22), canine models for hemophilia B (see, e.g., Mount et al., 2002, Blood 99(8):2670-6; Snyder et al., 1999, Nat Med 5(1):64-70; Fang et al., 1996, Gene Ther 3(3):217-22; and Kay et al., 1994, Proc Natl Acad Sci USA 91(6):2353-7), and a rhesus macaque model for hemophilia B (see, e.g., Lozier et al., 1999, Blood 93(6):1875-81).

Examples of animal models for von Willebrand disease include, but are not limited to, an inbred mouse strain RIIIS/J (see, e.g., Nichols et al., 1994, 83(11):3225-31 and Sweeney et al., 1990, 76(11):2258-65), rats injected with botrocetin (see, e.g., Sanders et al., 1988, Lab Invest 59(4):443-52), and porcine models for von Willebrand disease (see, e.g., Nichols et al., 1995, Proc Natl Acad Sci USA 92(7):2455-9; Johnson & Bowie, 1992, J Lab Clin Med 120(4):553-8; and Brinkhous et al., 1991, Mayo Clin Proc 66(7):733-42).

Examples of animal models for β-thalassemia include, but are not limited to, murine models with mutations in globin genes (see, e.g., Lewis et al., 1998, Blood 91(6):2152-6; Raja et al., 1994, Br J Haematol 86(1):156-62; Popp et al., 1985, 445:432-44; and Skow et al., 1983, Cell 34(3):1043-52).

Examples of animal models for kidney stones include, but are not limited to, genetic hypercalciuric rats (see, e.g., Bushinsky et al., 1999, Kidney Int 55(1):234-43 and Bushinsky et al., 1995, Kidney Int 48(6):1705-13), chemically treated rats (see, e.g., Grases et al., 1998, Scand J Urol Nephrol 32(4):261-5; Burgess et al., 1995, Urol Res 23(4):239-42; Kumar et al., 1991, J Urol 146(5):1384-9; Okada et al., 1985, Hinyokika Kiyo 31(4):565-77; and Bluestone et al., 1975, Lab Invest 33(3):273-9), hyperoxaluric rats (see, e.g., Jones et al., 1991, J Urol 145(4):868-74), pigs with unilateral retrograde flexible nephroscopy (see, e.g., Seifmah et al., 2001, 57(4):832-6), and rabbits with an obstructed upper urinary tract (see, e.g., Itatani et al., 1979, Invest Urol 17(3):234-40).

Examples of animal models for ataxia-telangiectasia include, but are not limited to, murine models of ataxia-telangiectasia (see, e.g., Barlow et al., 1999, Proc Natl Acad Sci USA 96(17):9915-9 and Inoue et al., 1986, Cancer Res 46(8):3979-82). A mouse model was generated for ataxia-telangiectasia using gene targeting to generate mice that did not express the Atm protein (see, e.g., Elson et al., 1996, Proc. Nat. Acad. Sci. 93:13084-13089).

Examples of animal models for lysosomal storage diseases include, but are not limited to, mouse models for mucopolysaccharidosis type VII (see, e.g., Brooks et al., 2002, Proc Natl Acad Sci USA. 99(9):6216-21; Monroy et al., 2002, Bone 30(2):352-9; Vogler et al., 2001, Pediatr Dev Pathol. 4(5):421-33; Vogler et al., 2001, Pediatr Res. 49(3):342-8; and Wolfe et al., 2000, Mol Ther. 2(6):552-6), a mouse model for metachromatic leukodystrophy (see, e.g., Matzner et al., 2002, Gene Ther. 9(1):53-63), a mouse model of Sandhoff disease (see, e.g., Sango et al., 2002, Neuropathol Appl Neurobiol. 28(1):23-34), mouse models for mucopolysaccharidosis type III A (see, e.g., Bhattacharyya et al., 2001, Glycobiology 11(1):99-10 and Bhaumik et al., 1999, Glycobiology 9(12):1389-96), arylsulfatase A (ASA)-deficient mice (see, e.g., D'Hooge et al., 1999, Brain Res. 847(2):352-6 and D'Hooge et al., 1999, Neurosci Lett. 273(2):93-6), mice with an aspartylglucosaminuria mutation (see, e.g., Jalanko et al., 1998, Hum Mol Genet. 7(2):265-72), feline models of mucopolysaccharidosis type VI (see, e.g., Crawley et al., 1998, J Clin Invest. 101(1):109-19 and Norrdin et al., 1995, Bone 17(5):485-9), a feline model of Niemann-Pick disease type C (see, e.g., March et al., 1997, Acta Neuropathol (Berl). 94(2):164-72), acid sphingomyelinase-deficient mice (see, e.g., Otterbach & Stoffel, 1995, Cell 81(7):1053-6), and bovine mannosidosis (see, e.g., Jolly et al., 1975, Birth Defects Orig Artic Ser. 11(6):273-8).

Examples of animal models for tuberous sclerosis (“TSC”) include, but are not limited to, a mouse model of TSC1 (see, e.g., Kwiatkowski et al., 2002, Hum Mol Genet. 11(5):525-34), a Tsc1 (TSC1 homologue) knockout mouse (see, e.g., Kobayashi et al., 2001, Proc Natl Acad Sci USA. 98(15):8762-7), a TSC2 gene mutant (Eker) rat model (see, e.g., Hino, 2000, Nippon Rinsho 58(6):1255-61; Mizuguchi et al., 2000, J Neuropathol Exp Neurol. 59(3):188-9; and Hino et al., 1999, Prog Exp Tumor Res. 35:95-108), and Tsc2(+/−) mice (see, e.g., Onda et al., 1999, J Clin Invest. 104(6):687-95).

5.9.3 Toxicity

The toxicity and/or efficacy of a compound identified in accordance with the invention can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ (the dose lethal to 50% of the population) and the ED₅₀ (the dose therapeutically effective in 50% of the population). Cells and cell lines that can be used to assess the cytotoxicity of a compound identified in accordance with the invention include, but are not limited to, peripheral blood mononuclear cells (PBMCs), Caco-2 cells, and Huh7 cells. The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD₅₀/ED₅₀. A compound identified in accordance with the invention that exhibits a large therapeutic index is preferred. While a compound identified in accordance with the invention that exhibits toxic side effects may be used, care should be taken to design a delivery system that targets such agents to the site of affected tissue in order to minimize potential damage to unaffected cells and, thereby, reduce side effects.
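By way of illustration only, the ratio LD₅₀/ED₅₀ described above can be sketched in a few lines of Python. The function name and the numerical values below are hypothetical and are not taken from any assay described herein; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to illustrate the arithmetic.
example_index = therapeutic_index(ld50=500.0, ed50=25.0)
print(example_index)  # 20.0
```

A larger value indicates a wider margin between the dose that is therapeutically effective and the dose that is lethal in 50% of the population.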

The data obtained from the cell culture assays and animal studies can be used in formulating a range of dosage of a compound identified in accordance with the invention for use in humans. The dosage of such agents lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any agent used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. A dose may be formulated in animal models to achieve a circulating plasma concentration range that includes the IC₅₀ (i.e., the concentration of the compound that achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such information can be used to more accurately determine useful doses in humans. Levels in plasma may be measured, for example, by high performance liquid chromatography.
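As a non-limiting sketch of how an IC₅₀ might be estimated from cell culture data, the following Python example interpolates linearly in log₁₀(concentration) between the two measurements that bracket 50% inhibition. The function name, the interpolation scheme and the data points are assumptions made for illustration; they are not a method prescribed herein, and a fitted dose-response model may be preferred in practice.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Estimate the concentration giving 50% inhibition.

    concentrations: ascending, positive values (e.g., micromolar)
    inhibitions: percent inhibition (0-100) measured at each concentration
    Interpolates linearly in log10(concentration) between bracketing points.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50% inhibition")


# Hypothetical dose-response data, for illustration only.
ic50 = estimate_ic50([0.1, 1.0, 10.0, 100.0], [5.0, 20.0, 80.0, 95.0])
print(round(ic50, 2))  # approximately 3.16
```

The estimate is only as good as the spacing of the tested concentrations around the half-maximal response.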

5.10 Design of Congeners or Analogs

The compounds which display the desired biological activity can be used as lead compounds for the development or design of congeners or analogs having useful pharmacological activity. For example, once a lead compound is identified, molecular modeling techniques can be used to design variants of the compound that can be more effective. Examples of molecular modeling systems are the CHARMm and QUANTA programs (Polygen Corporation, Waltham, Mass.). CHARMm performs the energy minimization and molecular dynamics functions. QUANTA performs the construction, graphic modelling and analysis of molecular structure. QUANTA allows interactive construction, modification, visualization, and analysis of the behavior of molecules with each other.

A number of articles review computer modeling of drugs interactive with specific proteins, such as Rotivinen et al., 1988, Acta Pharmaceutical Fennica 97:159-166; Ripka, 1998, New Scientist 54-57; McKinaly & Rossmann, 1989, Annu. Rev. Pharmacol. Toxicol. 29:111-122; Perry & Davies, QSAR: Quantitative Structure-Activity Relationships in Drug Design pp. 189-193 (Alan R. Liss, Inc. 1989); Lewis & Dean, 1989, Proc. R. Soc. Lond. 236:125-140 and 141-162; and Askew et al., 1989, J. Am. Chem. Soc. 111:1082-1090. Other computer programs that screen and graphically depict chemicals are available from companies such as BioDesign, Inc. (Pasadena, Calif.), Allelix, Inc. (Mississauga, Ontario, Canada), and Hypercube, Inc. (Cambridge, Ontario). Although these are primarily designed for application to drugs specific to particular proteins, they can be adapted to design of drugs specific to any identified region. The analogs and congeners can be tested for binding to translational machinery using assays well-known in the art or described herein for biologic activity. Alternatively, lead compounds with little or no biologic activity, as ascertained in the screen, can also be used to design analogs and congeners of the compound that have biologic activity.

5.11 Uses of Compounds to Prevent/Treat a Disorder

The present invention provides methods of preventing, treating, managing or ameliorating a disorder or one or more symptoms thereof, said methods comprising administering to a subject in need thereof one or more compounds identified in accordance with the methods of the invention or a pharmaceutically acceptable salt thereof. In particular, the present invention provides methods of preventing, treating, managing or ameliorating a disorder associated with premature translation termination and/or nonsense-mediated mRNA decay, or one or more symptoms thereof, said methods comprising administering to a subject in need thereof one or more compounds identified in accordance with the methods of the invention or a pharmaceutically acceptable salt thereof. Examples of diseases associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay include, but are not limited to, cystic fibrosis, muscular dystrophy, heart disease, lung cancer, breast cancer, colon cancer, pancreatic cancer, non-Hodgkin's lymphoma, ovarian cancer, esophageal cancer, colorectal carcinomas, neurofibromatosis, retinoblastoma, Wilms' tumor, retinitis pigmentosa, collagen disorders, cirrhosis, Tay-Sachs disease, blood disorders, kidney stones, ataxia-telangiectasia, lysosomal storage diseases, and tuberous sclerosis. See Sections 5.8 and 6.5 for additional non-limiting examples of diseases and genetic disorders which can be prevented, treated, managed or ameliorated by administering one or more of the compounds identified in accordance with the methods of the invention or a pharmaceutically acceptable salt thereof. Genes that contain one or more nonsense mutations that are potentially involved in causing disease are presented in table form according to chromosome location in Example 6.5 infra.

In a preferred embodiment, it is first determined that the patient is suffering from a disease associated with premature translation termination and/or nonsense-mediated mRNA decay before administering a compound identified in accordance with the invention or a combination therapy described herein. In a preferred embodiment, the DNA of the patient can be sequenced or subjected to Southern blot, polymerase chain reaction (PCR), Short Tandem Repeat (STR), or restriction fragment length polymorphism (RFLP) analysis to determine if a nonsense mutation is present in the DNA of the patient. Alternatively, it can be determined if altered levels of the protein with the nonsense mutation are expressed in the patient by western blot or other immunoassays. Such methods are well known to one of skill in the art.

In one embodiment, the invention provides a method of preventing, treating, managing or ameliorating a disorder or one or more symptoms thereof, said method comprising administering to a subject in need thereof a dose of a prophylactically or therapeutically effective amount of one or more compounds identified in accordance with the methods of the invention. In another embodiment, a compound identified in accordance with the methods of the invention is not administered to prevent, treat, or ameliorate a disorder or one or more symptoms thereof, if such compound has been used previously to prevent, treat, manage or ameliorate said disorder. In a more specific embodiment of the invention, disorders that can be prevented, managed and/or treated with the compounds of the invention include, but are not limited to, disorders that are associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay.

The invention provides methods of preventing, treating, managing or ameliorating a disorder or one or more symptoms thereof, said methods comprising administering to a subject in need thereof one or more of the compounds identified utilizing the screening methods described herein or a pharmaceutically acceptable salt thereof and one or more other therapies (e.g., prophylactic or therapeutic agents). In particular, the invention provides methods of preventing, treating, managing or ameliorating a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay, or one or more symptoms thereof, said methods comprising administering to a subject in need thereof one or more of the compounds identified utilizing the screening methods described herein or a pharmaceutically acceptable salt thereof, and one or more other therapies (e.g., prophylactic or therapeutic agents). Preferably, the other therapies are currently being used, have been used or are known to be useful in the prevention, treatment, management or amelioration of said disorder or a symptom thereof. Non-limiting examples of such therapies are in Section 5.11.1 infra.

The therapies (e.g., prophylactic or therapeutic agents) or the combination therapies of the invention can be administered sequentially or concurrently. In a specific embodiment, the combination therapies of the invention comprise a compound identified in accordance with the invention and at least one other therapy that has the same mechanism of action as said compound. In another specific embodiment, the combination therapies of the invention comprise a compound identified in accordance with the methods of the invention and at least one other therapy (e.g., prophylactic or therapeutic agent) which has a different mechanism of action than said compound. The combination therapies of the present invention improve the prophylactic or therapeutic effect of a compound of the invention by functioning together with the compound to have an additive or synergistic effect. The combination therapies of the present invention reduce the side effects associated with the therapies (e.g., prophylactic or therapeutic agents).

The prophylactic or therapeutic agents of the combination therapies can be administered to a subject in the same pharmaceutical composition. Alternatively, the prophylactic or therapeutic agents of the combination therapies can be administered concurrently to a subject in separate pharmaceutical compositions. The prophylactic or therapeutic agents may be administered to a subject by the same or different routes of administration.

In a specific embodiment, a pharmaceutical composition comprising one or more compounds identified in a screening assay described herein is administered to a subject, preferably a human, to prevent, treat, manage or ameliorate a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) or one or more symptoms thereof. In accordance with the invention, the pharmaceutical composition may also comprise one or more other prophylactic or therapeutic agents. Preferably, such prophylactic or therapeutic agents are currently being used, have been used or are known to be useful in the prevention, treatment, management or amelioration of a disorder (in particular, a disorder associated with, characterized by, or caused by premature translation termination or nonsense-mediated mRNA decay) or one or more symptoms thereof.

A compound identified in accordance with the methods of the invention may be used as a first, second, third, fourth or fifth line of therapy for a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). The invention provides methods for treating, managing or ameliorating a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) or one or more symptoms thereof in a subject refractory to conventional therapies for such disorder, said methods comprising administering to said subject a dose of a prophylactically or therapeutically effective amount of a compound identified in accordance with the methods of the invention. In particular, a disorder may be determined to be refractory to a therapy when at least some significant portion of the disorder is not resolved in response to the therapy. Such a determination can be made either in vivo or in vitro by any method known in the art for assaying the effectiveness of a therapy on a subject, using the art-accepted meanings of “refractory” in such a context. In a specific embodiment, a disorder is refractory where the number of symptoms of the disorder has not been significantly reduced, or has increased.

The invention provides methods for treating, managing or ameliorating one or more symptoms of a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) in a subject refractory to existing single agent therapies for such disorder, said methods comprising administering to said subject a dose of a prophylactically or therapeutically effective amount of a compound identified in accordance with the methods of the invention and a dose of a prophylactically or therapeutically effective amount of one or more other therapies (e.g., prophylactic or therapeutic agents). The invention also provides methods for treating or managing a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) by administering a compound identified in accordance with the methods of the invention in combination with any other therapy (e.g., radiation therapy, chemotherapy or surgery) to patients who have proven refractory to other therapies but are no longer on these therapies. The invention also provides methods for the treatment or management of a patient having a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay), wherein said patient is immunosuppressed or immunocompromised by reason of having previously undergone other therapies. Further, the invention provides methods for preventing the recurrence of a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) such as, e.g., cancer in patients that have undergone therapy and have no disease activity by administering a compound identified in accordance with the methods of the invention.

In addition to the use of the compounds identified in accordance with the invention for the prevention, treatment, management or amelioration of a disorder or a symptom thereof, the compounds may be used in vitro to modulate the expression of particular genes of interest. For example, the compounds may be used to increase or decrease the expression of a particular gene of interest when conducting in vitro studies.

5.11.1 Other Therapies

The present invention provides methods of preventing, treating, managing or ameliorating a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay), or one or more symptoms thereof, said methods comprising administering to a subject in need thereof one or more compounds identified in accordance with the methods of the invention or a pharmaceutically acceptable salt thereof, and one or more other therapies (e.g., prophylactic or therapeutic agents). Any therapy (e.g., chemotherapies, radiation therapies, hormonal therapies, and/or biological therapies/immunotherapies) which is known to be useful, or which has been used or is currently being used for the prevention, treatment, management or amelioration of a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay) or one or more symptoms thereof can be used in combination with a compound identified in accordance with the methods of the invention. Examples of therapeutic or prophylactic agents which can be used in combination with a compound identified in accordance with the invention include, but are not limited to, peptides, polypeptides, fusion proteins, nucleic acid molecules, small molecules, mimetic agents, synthetic drugs, inorganic molecules, and organic molecules.

Proliferative disorders associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay can be prevented, treated, managed or ameliorated by administering to a subject in need thereof one or more of the compounds identified in accordance with the methods of the invention, and one or more other therapies for prevention, treatment, management or amelioration of said disorders or a symptom thereof. Examples of such therapies include, but are not limited to, angiogenesis inhibitors, topoisomerase inhibitors, immunomodulatory agents (such as chemotherapeutic agents) and radiation therapy. Angiogenesis inhibitors (i.e., antiangiogenic agents) include, but are not limited to, angiostatin (plasminogen fragment); antiangiogenic antithrombin III; angiozyme; ABT-627; Bay 12-9566; Benefin; Bevacizumab; BMS-275291; cartilage-derived inhibitor (CDI); CAI; CD59 complement fragment; CEP-7055; Col 3; combretastatin A-4; endostatin (collagen XVIII fragment); fibronectin fragment; Gro-beta; Halofuginone; Heparinases; Heparin hexasaccharide fragment; HMV833; human chorionic gonadotropin (hCG); IM-862; Interferon alpha/beta/gamma; Interferon inducible protein (IP-10); Interleukin-12; Kringle 5 (plasminogen fragment); Marimastat; Metalloproteinase inhibitors (TIMPs); 2-methoxyestradiol; MMI 270 (CGS 27023A); MoAb IMC-1C11; Neovastat; NM-3; Panzem; PI-88; Placental ribonuclease inhibitor; plasminogen activator inhibitor; platelet factor-4 (PF4); Prinomastat; Prolactin 16 kD fragment; Proliferin-related protein (PRP); PTK 787/ZK 222594; retinoids; solimastat; squalamine; SS 3304; SU 5416; SU6668; SU11248; tetrahydrocortisol-S; tetrathiomolybdate; thalidomide; thrombospondin-1 (TSP-1); TNP-470; transforming growth factor-beta; vasculostatin; vasostatin (calreticulin fragment); ZD6126; ZD 6474; farnesyl transferase inhibitors (FTI); and bisphosphonates. In a specific embodiment, anti-angiogenic agents do not include antibodies or fragments thereof that immunospecifically bind to integrin α_(V)β₃.

Specific examples of propylactic or therapeutic agents which can be usedin accordance with the methods of the invention to prevent, treat,manage or ameliorate a proliferative disorder associated with,characterized by or caused by premature translation termination and/ornonsense-mediated mRNA decay, or a symptom thereof include, but notlimited to: acivicin; aclarubicin; acodazole hydrochloride; acronine;adozelesin; aldesleukin; altretamine; ambomycin; ametantrone acetate;aminoglutethimide; amsacrine; anastrozole; anthramycin; asparaginase;asperlin; azacitidine; azetepa; azotomycin; batimastat; benzodepa;bicalutamide; bisantrene hydrochloride; bisnafide dimesylate; bizelesin;bleomycin sulfate; brequinar sodium; bropirimine; busulfan;cactinomycin; calusterone; caracemide; carbetimer, carboplatin;carmustine; carubicin hydrochloride; carzelesin; cedefingol;chlorambucil; cirolemycin; cisplatin; cladribine; crisnatol mesylate;cyclophosphamide; cytarabine; dacarbazine; dactinomycin; daunorubicinhydrochloride; decitabine; dexormaplatin; dezaguanine; dezaguaninemesylate; diaziquone; docetaxel; doxorubicin; doxorubicin hydrochloride;droloxifene; droloxifene citrate; dromostanolone propionate; duazomycin;edatrexate; eflornithine hydrochloride; elsamitrucin; enloplatin;enpromate; epipropidine; epirubicin hydrochloride; erbulozole;esorubicin hydrochloride; estramustine; estramustine phosphate sodium;etanidazole; etoposide; etoposide phosphate; etoprine; fadrozolehydrochloride; fazarabine; fenretinide; floxuridine; fludarabinephosphate; fluorouracil; flurocitabine; fosquidone; fostriecin sodium;gemcitabine; gemcitabine hydrochloride; hydroxyurea; idarubicinhydrochloride; ifosfamide; ilmofosine; interleukin II (includingrecombinant interleukin II, or rIL2), interferon alpha-2a; interferonalpha-2b; interferon alpha-n1; interferon alpha-n3; interferon beta-I a;interferon gamma-I b; iproplatin; irinotecan hydrochloride; lanreotideacetate; letrozole; leuprolide acetate; liarozole 
hydrochloride;lometrexol sodium; lomustine; losoxantrone hydrochloride; masoprocol;maytansine; mechlorethamine hydrochloride; megestrol acetate;melengestrol acetate; melphalan; menogaril; mercaptopurine;methotrexate; methotrexate sodium; metoprine; meturedepa; mitindomide;mitocarcin; mitocromin; mitogillin; mitomalcin; mitomycin; mitosper;mitotane; mitoxantrone hydrochloride; mycophenolic acid; nocodazole;nogalamycin; ormaplatin; oxisuran; paclitaxel; pegaspargase; peliomycin;pentamustine; peplomycin sulfate; perfosfamide; pipobroman; piposulfan;piroxantrone hydrochloride; plicamycin; plomestane; porfimer sodium;porfiromycin; prednimustine; procarbazine hydrochloride; puromycin;puromycin hydrochloride; pyrazofurin; riboprine; rogletimide; safingol;safingol hydrochloride; semustine; simtrazene; sparfosate sodium;sparsomycin; spirogermanium hydrochloride; spiromustine; spiroplatin;streptonigrin; streptozocin; sulofenur, talisomycin; tecogalan sodium;tegafur; teloxantrone hydrochloride; temoporfin; teniposide; teroxirone;testolactone; thiamiprine; thioguanine; thiotepa; tiazofurin;tirapazamine; toremifene citrate; trestolone acetate; triciribinephosphate; trimetrexate; trimetrexate glucuronate; triptorelin;tubulozole hydrochloride; uracil mustard; uredepa; vapreotide;verteporfin; vinblastine sulfate; vincristine sulfate; vindesine;vindesine sulfate; vinepidine sulfate; vinglycinate sulfate;vinleurosine sulfate; vinorelbine tartrate; vinrosidine sulfate;vinzolidine sulfate; vorozole; zeniplatin; zinostatin; zorubicinhydrochloride. 
Other anti-cancer drugs include, but are not limited to:20-epi-1,25 dihydroxyvitamin D3; 5-ethynyluracil; abiraterone;aclarubicin; acylfulvene; adecypenol; adozelesin; aldesleukin; ALL-TKantagonists; altretamine; ambamustine; amidox; amifostine;aminolevulinic acid; amrubicin; amsacrine; anagrelide; anastrozole;andrographolide; angiogenesis inhibitors; antagonist D; antagonist G;antarelix; anti-dorsalizing morphogenetic protein-1; antiandrogen,prostatic carcinoma; antiestrogen; antineoplaston; antisenseoligonucleotides; aphidicolin glycinate; apoptosis gene modulators;apoptosis regulators; apurinic acid; ara-CDP-DL-PTBA; argininedeaminase; asulacrine; atamestane; atrimustine; axinastatin 1;axinastatin 2; axinastatin 3; azasetron; azatoxin; azatyrosine; baccatinIII derivatives; balanol; batimastat; BCR/ABL antagonists;benzochlorins; benzoylstaurosporine; beta lactam derivatives;beta-alethine; betaclamycin B; betulinic acid; bFGF inhibitor;bicalutamide; bisantrene; bisaziridinylspermine; bisnafide; bistrateneA; bizelesin; breflate; bropirimine; budotitane; buthionine sulfoximine;calcipotriol; calphostin C; camptothecin derivatives; canarypox IL-2;capecitabine; carboxamide-amino-triazole; carboxyamidotriazole; CaRestM3; CARN 700; cartilage derived inhibitor, carzelesin; casein kinaseinhibitors (ICOS); castanospermine; cecropin B; cetrorelix; chlorlns;chloroquinoxaline sulfonamide; cicaprost; cis-porphyrin; cladribine;clomifene analogues; clotrimazole; collismycin A; collismycin B;combretastatin A4; combretastatin analogue; conagenin; crambescidin 816;crisnatol; cryptophycin 8; cryptophycin A derivatives; curacin A;cyclopentanthraquinones; cycloplatam; cypemycin; cytarabine ocfosfate;cytolytic factor; cytostatin; dacliximab; decitabine; dehydrodidemnin B;deslorelin; dexamethasone; dexifosfamide; dexrazoxane; dexverapamil;diaziquone; didemnin B; didox; diethylnorspermine;dihydro-5-azacytidine; dihydrotaxol, 9-; dioxamycin; diphenylspiromustine; docetaxel; docosanol; 
dolasetron; doxifluridine;droloxifene; dronabinol; duocarmycin SA; ebselen; ecomustine;edelfosine; edrecolomab; eflornithine; elemene; emitefur; epirubicin;epristeride; estramustine analogue; estrogen agonists; estrogenantagonists; etanidazole; etoposide phosphate; exemestane; fadrozole;fazarabine; fenretinide; filgrastim; finasteride; flavopiridol;flezelastine; fluasterone; fludarabine; fluorodaunorunicinhydrochloride; forfenimex; formestane; fostriecin; fotemustine;gadolinium texaphyrin; gallium nitrate; galocitabine; ganirelix;gelatinase inhibitors; gemcitabine; glutathione inhibitors; hepsulfam;heregulin; hexamethylene bisacetamide; hypericin; ibandronic acid;idarubicin; idoxifene; idramantone; ilmofosine; ilomastat;imidazoacridones; imiquimod; immunostimulant peptides; insulin-likegrowth factor-1 receptor inhibitor; interferon agonists; interferons;interleukins; iobenguane; iododoxorubicin; ipomeanol, 4-; iroplact;irsogladine; isobengazole; isohomohalicondrin B; itasetron;jasplakinolide; kahalalide F; lamellarin-N triacetate; lanreotide;leinamycin; lenograstim; lentinan sulfate; leptolstatin; letrozole;leukemia inhibiting factor, leukocyte alpha interferon;leuprolide+estrogen+progesterone; leuprorelin; levamisole; liarozole;linear polyamine analogue; lipophilic disaccharide peptide; lipophilicplatinum compounds; lissoclinamide 7; lobaplatin; lombricine;lometrexol; lonidamine; losoxantrone; lovastatin; loxoribine;lurtotecan; lutetium texaphyrin; lysofylline; lytic peptides;maitansine; mannostatin A; marimastat; masoprocol; maspin; matrilysininhibitors; matrix metalloproteinase inhibitors; menogaril; merbarone;meterelin; methioninase; metoclopramide; MIF inhibitor; mifepristone;miltefosine; mirimostim; mismatched double stranded RNA; mitoguazone;mitolactol; mitomycin analogues; mitonafide; mitotoxin fibroblast growthfactor-saporin; mitoxantrone; mofarotene; molgramostim; monoclonalantibody, human chorionic gonadotrophin; monophosphoryl lipidA+myobacterium cell 
wall sk; mopidamol; multiple drug resistance geneinhibitor; multiple tumor suppressor 1-based therapy; mustard anticanceragent; mycaperoxide B; mycobacterial cell wall extract; myriaporone;N-acetyldinaline; N-substituted benzamides; nafarelin; nagrestip;naloxone+pentazocine; napavin; naphterpin; nartograstim; nedaplatin;nemorubicin; neridronic acid; neutral endopeptidase; nilutamide;nisamycin; nitric oxide modulators; nitroxide antioxidant; nitrullyn;O6-benzylguanine; octreotide; okicenone; oligonucleotides; onapristone;ondansetron; ondansetron; oracin; oral cytokine inducer; ormaplatin;osaterone; oxaliplatin; oxaunomycin; paclitaxel; paclitaxel analogues;paclitaxel derivatives; palauamine; palmitoylrhizoxin; pamidronic acid;panaxytriol; panomifene; parabactin; pazelliptine; pegaspargase;peldesine; pentosan polysulfate sodium; pentostatin; pentrozole;perflubron; perfosfamide; perillyl alcohol; phenazinomycin;phenylacetate; phosphatase inhibitors; picibanil; pilocarpinehydrochloride; pirarubicin; piritrexim; placetin A; placetin B;plasminogen activator inhibitor, platinum complex; platinum compounds;platinum-triamine complex; porfimer sodium; porfiromycin; prednisone;propyl bis-acridone; prostaglandin J2; proteasome inhibitors; proteinA-based immune modulator; protein kinase C inhibitor; protein kinase Cinhibitors, microalgal; protein tyrosine phosphatase inhibitors; purinenucleoside phosphorylase inhibitors; purpurins; pyrazoloacridine;pyridoxylated hemoglobin polyoxyethylene conjugate; raf antagonists;raltitrexed; ramosetron; ras farnesyl protein transferase inhibitors;ras inhibitors; ras-GAP inhibitor; retelliptine demethylated; rhenium Re186 etidronate; rhizoxin; ribozymes; RII retinamide; rogletimide;rohitukine; romurtide; roquinimex; rubiginone B1; ruboxyl; safingol;saintopin; SarCNU; sarcophytol A; sargramostim; Sdi 1 mimetics;semustine; senescence derived inhibitor 1; sense oligonucleotides;signal transduction inhibitors; signal transduction modulators; 
singlechain antigen binding protein; sizofiran; sobuzoxane; sodiumborocaptate; sodium phenylacetate; solverol; somatomedin bindingprotein; sonermin; sparfosic acid; spicamycin D; spiromustine;splenopentin; spongistatin 1; squalamine; stem cell inhibitor; stem-celldivision inhibitors; stipiamide; stromelysin inhibitors; sulfinosine;superactive vasoactive intestinal peptide antagonist; suradista;suramin; swainsonine; synthetic glycosaminoglycans; tallimustine;5-fluorouracil; leucovorin; tamoxifen methiodide; tauromustine;tazarotene; tecogalan sodium; tegafur; tellurapyrylium; telomeraseinhibitors; temoporfin; temozolomide; teniposide; tetrachlorodecaoxide;tetrazomine; thaliblastine; thiocoraline; thrombopoietin; thrombopoietinmimetic; thymalfasin; thymopoietin receptor agonist; thymotrinan;thyroid stimulating hormone; tin ethyl etiopurpurin; tirapazamine;titanocene bichloride; topsentin; toremifene; totipotent stem cellfactor; translation inhibitors; tretinoin; triacetyluridine;triciribine; trimetrexate; triptorelin; tropisetron; turosteride;tyrosine kinase inhibitors; tyrphostins; UBC inhibitors; ubenimex;urogenital sinus-derived growth inhibitory factor, urokinase receptorantagonists; vapreotide; variolin B; vector system, erythrocyte genetherapy; thalidomide; velaresol; veramine; verdins; verteporfin;vinorelbine; vinxaltine; vorozole; zanoterone; zeniplatin; zilascorb;and zinostatin stimalamer.

Specific examples of prophylactic or therapeutic agents which can be used in accordance with the methods of the invention to prevent, treat, manage and/or ameliorate a central nervous system disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay, or a symptom thereof include, but are not limited to: Levodopa, L-DOPA, cocaine, α-methyl-tyrosine, reserpine, tetrabenazine, benztropine, pargyline, fenoldopam mesylate, cabergoline, pramipexole dihydrochloride, ropinirole, amantadine hydrochloride, selegiline hydrochloride, carbidopa, pergolide mesylate, Sinemet CR, or Symmetrel.

Specific examples of prophylactic or therapeutic agents which can be used in accordance with the methods of the invention to prevent, treat, manage and/or ameliorate a metabolic disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay, or a symptom thereof include, but are not limited to: a monoamine oxidase inhibitor (MAO), for example, but not limited to, iproniazid, clorgyline, phenelzine and isocarboxazid; an acetylcholinesterase inhibitor, for example, but not limited to, physostigmine salicylate, physostigmine sulfate, physostigmine bromide, neostigmine bromide, neostigmine methylsulfate, ambenonium chloride, edrophonium chloride, tacrine, pralidoxime chloride, obidoxime chloride, trimedoxime bromide, diacetyl monoxime, edrophonium, pyridostigmine, and demecarium; an anti-inflammatory agent, including, but not limited to, naproxen sodium, diclofenac sodium, diclofenac potassium, celecoxib, sulindac, oxaprozin, diflunisal, etodolac, meloxicam, ibuprofen, ketoprofen, nabumetone, rofecoxib, methotrexate, leflunomide, sulfasalazine, gold salts, RHo-D Immune Globulin, mycophenolate mofetil, cyclosporine, azathioprine, tacrolimus, basiliximab, daclizumab, salicylic acid, acetylsalicylic acid, methyl salicylate, diflunisal, salsalate, olsalazine, sulfasalazine, acetaminophen, indomethacin, sulindac, mefenamic acid, meclofenamate sodium, tolmetin, ketorolac, diclofenac, flurbiprofen, oxaprozin, piroxicam, meloxicam, ampiroxicam, droxicam, pivoxicam, tenoxicam, phenylbutazone, oxyphenbutazone, antipyrine, aminopyrine, apazone, zileuton, aurothioglucose, gold sodium thiomalate, auranofin, methotrexate, colchicine, allopurinol, probenecid, sulfinpyrazone and benzbromarone or betamethasone and other glucocorticoids; an antiemetic agent, for example, but not limited to, metoclopramide, domperidone, prochlorperazine, promethazine, chlorpromazine, trimethobenzamide, ondansetron, granisetron, hydroxyzine, acetylleucine monoethanolamine, alizapride, azasetron, benzquinamide, bietanautine, bromopride, buclizine, clebopride, cyclizine, dimenhydrinate, diphenidol, dolasetron, meclizine, methallatal, metopimazine, nabilone, oxypendyl, pipamazine, scopolamine, sulpiride, tetrahydrocannabinol, thiethylperazine, thioproperazine, tropisetron, and mixtures thereof.

5.12 Compounds and Methods of Administering Compounds

Biologically active compounds identified using the methods of the invention or a pharmaceutically acceptable salt thereof can be administered to a patient, preferably a mammal, more preferably a human, suffering from a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay). In a specific embodiment, a compound or a pharmaceutically acceptable salt thereof is administered to a patient, preferably a mammal, more preferably a human, as a preventative measure against a disorder (in particular, a disorder associated with, characterized by or caused by premature translation termination and/or nonsense-mediated mRNA decay).

In one embodiment, the compound or a pharmaceutically acceptable salt thereof is administered as a preventative measure to a patient. According to this embodiment, the patient can have a genetic predisposition to a disease, such as a family history of the disease, or a non-genetic predisposition to the disease. Accordingly, the compound and pharmaceutically acceptable salts thereof can be used for the treatment of one manifestation of a disease and prevention of another.

A compound identified in accordance with the invention, or a pharmaceutically acceptable salt thereof, may be a component of a composition optionally comprising a carrier, diluent or excipient. When administered to a patient, the compound or a pharmaceutically acceptable salt thereof is preferably administered as a component of a composition that optionally comprises a pharmaceutically acceptable vehicle. The composition can be administered orally, or by any other convenient route, for example, by infusion or bolus injection, or by absorption through epithelial or mucocutaneous linings (e.g., oral mucosa, rectal, and intestinal mucosa, etc.), and may be administered together with another biologically active agent. Administration can be systemic or local. Various delivery systems are known, e.g., encapsulation in liposomes, microparticles, microcapsules, capsules, etc., and can be used to administer the compound and pharmaceutically acceptable salts thereof.

Methods of administration include, but are not limited to, intradermal, intramuscular, intraperitoneal, intravenous, subcutaneous, intranasal, epidural, oral, sublingual, intracerebral, intravaginal, transdermal, rectal, inhalation, or topical administration, particularly to the ears, nose, eyes, or skin. The mode of administration is left to the discretion of the practitioner. In most instances, administration will result in the release of the compound or a pharmaceutically acceptable salt thereof into the bloodstream.

In specific embodiments, it may be desirable to administer the compound or a pharmaceutically acceptable salt thereof locally. This may be achieved, for example, and not by way of limitation, by local infusion during surgery, topical application, e.g., in conjunction with a wound dressing after surgery, by injection, by means of a catheter, by means of a suppository, or by means of an implant, said implant being of a porous, non-porous, or gelatinous material, including membranes, such as silastic membranes, or fibers.

In certain embodiments, it may be desirable to introduce the compound or a pharmaceutically acceptable salt thereof into the central nervous system by any suitable route, including intraventricular, intrathecal and epidural injection. Intraventricular injection may be facilitated by an intraventricular catheter, for example, attached to a reservoir, such as an Ommaya reservoir.

Pulmonary administration can also be employed, e.g., by use of an inhaler or nebulizer, and formulation with an aerosolizing agent, or via perfusion in a fluorocarbon or synthetic pulmonary surfactant. In certain embodiments, the compound and pharmaceutically acceptable salts thereof can be formulated as a suppository, with traditional binders and vehicles such as triglycerides.

In another embodiment, the compound and pharmaceutically acceptable salts thereof can be delivered in a vesicle, in particular a liposome (see Langer, 1990, Science 249:1527-1533; Treat et al., in Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and Fidler (eds.), Liss, New York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327; see generally ibid.).

In yet another embodiment, the compound and pharmaceutically acceptable salts thereof can be delivered in a controlled release system (see, e.g., Goodson, in Medical Applications of Controlled Release, supra, vol. 2, pp. 115-138 (1984)). Other controlled-release systems discussed in the review by Langer, 1990, Science 249:1527-1533 may be used. In one embodiment, a pump may be used (see Langer, supra; Sefton, 1987, CRC Crit. Rev. Biomed. Eng. 14:201; Buchwald et al., 1980, Surgery 88:507; Saudek et al., 1989, N. Engl. J. Med. 321:574). In another embodiment, polymeric materials can be used (see Medical Applications of Controlled Release, Langer and Wise (eds.), CRC Press, Boca Raton, Fla. (1974); Controlled Drug Bioavailability, Drug Product Design and Performance, Smolen and Ball (eds.), Wiley, New York (1984); Langer and Peppas, 1983, J. Macromol. Sci. Rev. Macromol. Chem. 23:61; see also Levy et al., 1985, Science 228:190; During et al., 1989, Ann. Neurol. 25:351; Howard et al., 1989, J. Neurosurg. 71:105). In yet another embodiment, a controlled-release system can be placed in proximity of a target RNA of the compound or a pharmaceutically acceptable salt thereof, thus requiring only a fraction of the systemic dose.

Compositions comprising the compound or a pharmaceutically acceptable salt thereof (“compound compositions”) can additionally comprise a suitable amount of a pharmaceutically acceptable vehicle so as to provide the form for proper administration to the patient.

In a specific embodiment, the term “pharmaceutically acceptable” means approved by a regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in animals, mammals, and more particularly in humans. The term “vehicle” refers to a diluent, adjuvant, excipient, or carrier with which a compound of the invention is administered. Such pharmaceutical vehicles can be liquids, such as water and oils, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like. The pharmaceutical vehicles can be saline, gum acacia, gelatin, starch paste, talc, keratin, colloidal silica, urea, and the like. In addition, auxiliary, stabilizing, thickening, lubricating and coloring agents may be used. When administered to a patient, the pharmaceutically acceptable vehicles are preferably sterile. Water is a preferred vehicle when the compound of the invention is administered intravenously. Saline solutions and aqueous dextrose and glycerol solutions can also be employed as liquid vehicles, particularly for injectable solutions. Suitable pharmaceutical vehicles also include excipients such as starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica gel, sodium stearate, glycerol monostearate, talc, sodium chloride, dried skim milk, glycerol, propylene glycol, water, ethanol and the like. Compound compositions, if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents.

Compound compositions can take the form of solutions, suspensions, emulsions, tablets, pills, pellets, capsules, capsules containing liquids, powders, sustained-release formulations, suppositories, aerosols, sprays, or any other form suitable for use. In one embodiment, the pharmaceutically acceptable vehicle is a capsule (see, e.g., U.S. Pat. No. 5,698,155). Other examples of suitable pharmaceutical vehicles are described in Remington's Pharmaceutical Sciences, Alfonso R. Gennaro, ed., Mack Publishing Co., Easton, Pa., 19th ed., 1995, pp. 1447 to 1676, incorporated herein by reference.

In a preferred embodiment, the compound or a pharmaceutically acceptable salt thereof is formulated in accordance with routine procedures as a pharmaceutical composition adapted for oral administration to human beings. Compositions for oral delivery may be in the form of tablets, lozenges, aqueous or oily suspensions, granules, powders, emulsions, capsules, syrups, or elixirs, for example. Orally administered compositions may contain one or more agents, for example, sweetening agents such as fructose, aspartame or saccharin; flavoring agents such as peppermint, oil of wintergreen, or cherry; coloring agents; and preserving agents, to provide a pharmaceutically palatable preparation. Moreover, where in tablet or pill form, the compositions can be coated to delay disintegration and absorption in the gastrointestinal tract, thereby providing a sustained action over an extended period of time. Selectively permeable membranes surrounding an osmotically active driving compound are also suitable for orally administered compositions. In these latter platforms, fluid from the environment surrounding the capsule is imbibed by the driving compound, which swells to displace the agent or agent composition through an aperture. These delivery platforms can provide an essentially zero order delivery profile as opposed to the spiked profiles of immediate release formulations. A time delay material such as glycerol monostearate or glycerol stearate may also be used. Oral compositions can include standard vehicles such as mannitol, lactose, starch, magnesium stearate, sodium saccharin, cellulose, magnesium carbonate, and the like. Such vehicles are preferably of pharmaceutical grade. Typically, compositions for intravenous administration comprise sterile isotonic aqueous buffer. Where necessary, the compositions may also include a solubilizing agent.

In another embodiment, the compound or a pharmaceutically acceptable salt thereof can be formulated for intravenous administration. Compositions for intravenous administration may optionally include a local anesthetic such as lignocaine to lessen pain at the site of the injection. Generally, the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water-free concentrate in a hermetically sealed container such as an ampoule or sachette indicating the quantity of active agent. Where the compound or a pharmaceutically acceptable salt thereof is to be administered by infusion, it can be dispensed, for example, with an infusion bottle containing sterile pharmaceutical grade water or saline. Where the compound or a pharmaceutically acceptable salt thereof is administered by injection, an ampoule of sterile water for injection or saline can be provided so that the ingredients may be mixed prior to administration.

The amount of a compound or a pharmaceutically acceptable salt thereof that will be effective in the treatment of a particular disease will depend on the nature of the disease, and can be determined by standard clinical techniques. In addition, in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed will also depend on the route of administration, and the seriousness of the disease, and should be decided according to the judgment of the practitioner and each patient's circumstances. However, suitable dosage ranges for oral administration are generally about 0.001 milligram to about 500 milligrams of a compound or a pharmaceutically acceptable salt thereof per kilogram body weight per day. In specific preferred embodiments of the invention, the oral dose is about 0.01 milligram to about 100 milligrams per kilogram body weight per day, more preferably about 0.1 milligram to about 75 milligrams per kilogram body weight per day, more preferably about 0.5 milligram to 5 milligrams per kilogram body weight per day. The dosage amounts described herein refer to total amounts administered; that is, if more than one compound is administered, or if a compound is administered with a therapeutic agent, then the preferred dosages correspond to the total amount administered. Oral compositions preferably contain about 10% to about 95% active ingredient by weight.

Suitable dosage ranges for intravenous (i.v.) administration are about 0.01 milligram to about 100 milligrams per kilogram body weight per day, about 0.1 milligram to about 35 milligrams per kilogram body weight per day, and about 1 milligram to about 10 milligrams per kilogram body weight per day. Suitable dosage ranges for intranasal administration are generally about 0.01 pg/kg body weight per day to about 1 mg/kg body weight per day. Suppositories generally contain about 0.01 milligram to about 50 milligrams of a compound of the invention per kilogram body weight per day and comprise active ingredient in the range of about 0.5% to about 10% by weight.

Recommended dosages for intradermal, intramuscular, intraperitoneal, subcutaneous, epidural, sublingual, intracerebral, intravaginal, transdermal administration or administration by inhalation are in the range of about 0.001 milligram to about 200 milligrams per kilogram of body weight per day. Suitable doses for topical administration are in the range of about 0.001 milligram to about 1 milligram, depending on the area of administration. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems. Such animal models and systems are well known in the art.
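The per-kilogram ranges recited above scale linearly with body weight, which can be illustrated with a minimal sketch. The 70 kg body weight, the route labels and the function name below are hypothetical and serve only as an illustration; the numeric ranges are the broadest ranges recited above for each route.

```python
# Broadest dosage ranges recited in the text, in mg per kg body weight per day.
# The route labels are illustrative names, not terms defined in the text.
DOSE_RANGES_MG_PER_KG_DAY = {
    "oral": (0.001, 500.0),
    "intravenous": (0.01, 100.0),
    "other_routes": (0.001, 200.0),  # intradermal, intramuscular, etc.
}


def daily_dose_range_mg(route, body_weight_kg):
    """Return (low, high) total daily dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGES_MG_PER_KG_DAY[route]
    return low * body_weight_kg, high * body_weight_kg


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical adult body weight (assumption)
    for route in DOSE_RANGES_MG_PER_KG_DAY:
        low, high = daily_dose_range_mg(route, weight_kg)
        print(f"{route}: {low:g} mg/day to {high:g} mg/day")
```

Under these assumptions, the narrowest preferred oral range (0.5 to 5 milligrams per kilogram per day) would correspond to 35 to 350 milligrams per day for a 70 kg subject.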

The compound and pharmaceutically acceptable salts thereof are preferably assayed in vitro and in vivo, for the desired therapeutic or prophylactic activity, prior to use in humans. For example, in vitro assays can be used to determine whether it is preferable to administer the compound, a pharmaceutically acceptable salt thereof, and/or another therapeutic agent. Animal model systems can be used to demonstrate safety and efficacy.

6. EXAMPLES

6.1 Example: Identification of a Dye-Labeled Target RNA Bound to Small Molecular Weight Compounds

The results presented in this Example indicate that gel mobility shift assays can be used to detect the binding of small molecules, such as the Tat peptide and gentamicin, to their respective target RNAs.

Materials and Methods

Buffers

Tris-potassium chloride (TK) buffer is composed of 50 mM Tris-HCl pH 7.4, 20 mM KCl, 0.1% Triton X-100, and 0.5 mM MgCl₂. Tris-borate-EDTA (TBE) buffer is composed of 45 mM Tris-borate pH 8.0, and 1 mM EDTA. Tris-potassium chloride-magnesium (TKM) buffer is composed of 50 mM Tris-HCl pH 7.4, 20 mM KCl, 0.1% Triton X-100 and 5 mM MgCl₂.
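For illustration, the stock volumes needed to prepare the TK and TKM buffers can be worked out from the dilution relation C1·V1 = C2·V2. The sketch below is not part of the described method: the stock concentrations (1 M Tris-HCl, 1 M KCl, 1 M MgCl₂, 10% Triton X-100), the 100 ml batch size and all function names are assumptions introduced here.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed, from C1*V1 = C2*V2 (both concentrations in the same unit)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_conc * final_volume_ml / stock_conc


# Final concentrations from the text: mM for Tris-HCl, KCl and MgCl2; % for Triton X-100.
TK_BUFFER = {"Tris-HCl pH 7.4": 50.0, "KCl": 20.0, "Triton X-100": 0.1, "MgCl2": 0.5}
TKM_BUFFER = dict(TK_BUFFER, MgCl2=5.0)  # differs from TK only in MgCl2

# Hypothetical stock solutions (assumptions), in the same units as above.
STOCKS = {"Tris-HCl pH 7.4": 1000.0, "KCl": 1000.0, "Triton X-100": 10.0, "MgCl2": 1000.0}


def recipe_ml(buffer, final_volume_ml):
    """Return the stock volume (ml) per component plus the water needed to reach final volume."""
    volumes = {
        name: stock_volume_ml(conc, STOCKS[name], final_volume_ml)
        for name, conc in buffer.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for label, buffer in (("TK", TK_BUFFER), ("TKM", TKM_BUFFER)):
        print(label)
        for name, volume in recipe_ml(buffer, 100.0).items():
            print(f"  {name}: {volume:.2f} ml")
```

With these assumed stocks, the two recipes differ only in the MgCl₂ volume, which is tenfold higher for TKM than for TK.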

Gel Retardation Analysis

RNA oligonucleotides were purchased from Dharmacon, Inc. (Lafayette, Colo.). 500 pmole of either a 5′ fluorescein labeled oligonucleotide corresponding to the 16S rRNA A site (5′-GGCGUCACACCUUCGGGUGAAGUCGCC-3′ (SEQ ID NO: 1); Moazed & Noller, 1987, Nature 327:389-394; Woodcock et al., 1991, EMBO J. 10:3099-3103; Yoshizawa et al., 1998, EMBO J. 17:6437-6448) or a 5′ fluorescein labeled oligonucleotide corresponding to the HIV-1 TAR RNA element (5′-GGCAGAUCUGAGCCUGGGAGCUCUCUGCC-3′ (SEQ ID NO: 2); Huq et al., 1999, Nucleic Acids Res. 27:1084-1093; Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96:12997-13002) was 3′ labeled with 5′-³²P cytidine 3′,5′-bis(phosphate) (NEN) and T4 RNA ligase (NEBiolabs) in 10% DMSO as per manufacturer's instructions. The labeled oligonucleotides were purified using G-25 Sephadex columns (Boehringer Mannheim). For Tat-TAR gel retardation reactions, the method of Huq et al. (1999, Nucleic Acids Res. 27:1084-1093) was utilized with TK buffer containing 0.5 mM MgCl₂ and a 12-mer Tat peptide (YGRKKRRQRRRP (SEQ ID NO: 3); single letter amino acid code). For 16S rRNA-gentamicin reactions, the method of Huq et al. was used with TKM buffer. In 20 μl reaction volumes, 50 pmoles of ³²P cytidine-labeled oligonucleotide and either gentamicin sulfate (Sigma) or the short Tat peptide (Tat₄₇₋₅₈) in TK or TKM buffer were heated at 90° C. for 2 minutes and allowed to cool to room temperature (approximately 24° C.) over 2 hours. Then 10 μl of 30% glycerol was added to each reaction tube and the entire sample was loaded onto a TBE non-denaturing polyacrylamide gel and electrophoresed for 1200-1600 volt-hours at 4° C. The gel was exposed to an intensifying screen and radioactivity was quantitated using a Typhoon phosphorimager (Molecular Dynamics).
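The quantities stated above imply the working concentrations in each reaction, which can be checked with simple arithmetic. The following sketch is illustrative only; the function names are hypothetical, and the 200 V running voltage is an assumption, since only the volt-hour product is stated.

```python
def concentration_um(amount_pmol, volume_ul):
    """Concentration in micromolar; pmol per microliter is numerically equal to uM."""
    return amount_pmol / volume_ul


def dilute(conc, volume_before_ul, volume_added_ul):
    """Concentration after adding volume_added_ul of liquid lacking the solute."""
    return conc * volume_before_ul / (volume_before_ul + volume_added_ul)


def run_time_hours(volt_hours, voltage_v):
    """Electrophoresis time needed to reach a given volt-hour product."""
    return volt_hours / voltage_v


if __name__ == "__main__":
    rna_um = concentration_um(50.0, 20.0)        # 50 pmol RNA in a 20 ul reaction
    rna_loaded_um = dilute(rna_um, 20.0, 10.0)   # after adding 10 ul of glycerol
    glycerol_pct = dilute(30.0, 10.0, 20.0)      # 30% glycerol diluted into 30 ul total
    print(f"RNA in binding reaction: {rna_um:g} uM")
    print(f"RNA in loaded sample: {rna_loaded_um:.2f} uM")
    print(f"Glycerol in loaded sample: {glycerol_pct:g}%")
    voltage_v = 200.0  # hypothetical running voltage (assumption)
    for vh in (1200.0, 1600.0):
        print(f"{vh:g} V-h at {voltage_v:g} V: {run_time_hours(vh, voltage_v):g} h")
```

The RNA is therefore at 2.5 μM during the binding reaction, and the loaded sample contains 10% glycerol.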

Background

One method used to demonstrate small molecule interactions with naturally occurring RNA structures such as ribosomes is chemical footprinting or toe printing (Moazed & Noller, 1987, Nature 327:389-394; Woodcock et al., 1991, EMBO J. 10:3099-3103; Yoshizawa et al., 1998, EMBO J. 17:6437-6448). Here, the use of gel mobility shift assays to monitor RNA-small molecule interactions is described. This approach allows for rapid visualization of small molecule-RNA interactions based on the difference between the mobility of RNA alone and that of RNA in a complex with a small molecule. To validate this approach, RNA oligonucleotides corresponding to the well-characterized gentamicin binding site on the 16S rRNA (Moazed & Noller, 1987, Nature 327:389-394) and the equally well-characterized HIV-1 Tat protein binding site on the HIV-1 TAR element (Huq et al., 1999, Nucleic Acids Res. 27:1084-1093) were chosen. The purpose of these experiments is to lay the groundwork for the use of chromatographic techniques in a high throughput fashion, such as microcapillary electrophoresis, for drug discovery.

Results

A gel retardation assay was performed using the Tat₄₇₋₅₈ peptide and the TAR RNA oligonucleotide. As shown in FIG. 2, in the presence of the Tat peptide, a clear shift is visible when the products are separated on a 12% non-denaturing polyacrylamide gel. In the reaction that lacks peptide, only the free RNA is visible. These observations confirm previous reports made using other Tat peptides (Hamy et al., 1997, Proc. Natl. Acad. Sci. USA 94:3548-3553; Huq et al., 1999, Nucleic Acids Res. 27:1084-1093).

Based on the results of FIG. 2, it was hypothesized that RNA interactions with small organic molecules could also be visualized using this method. As shown in FIG. 3, the addition of varying concentrations of gentamicin to an RNA oligonucleotide corresponding to the 16S rRNA A site produces a mobility shift. These results demonstrate that the binding of the small molecule gentamicin to an RNA oligonucleotide having a defined structure in solution can be monitored using this approach. In addition, as shown in FIG. 3, a concentration as low as 10 ng/ml gentamicin produces the mobility shift.

To determine whether lower concentrations of gentamicin would be sufficient to produce a gel shift, an experiment similar to that shown in FIG. 3 was performed, except that the concentrations of gentamicin ranged from 100 ng/ml to 10 pg/ml. As shown in FIG. 4, gel mobility shifts are produced when the gentamicin concentration is as low as 10 pg/ml. Further, the results shown in FIG. 4 demonstrate that the shift is specific to the 16S rRNA oligonucleotide, as the use of an unrelated oligonucleotide, corresponding to the HIV TAR RNA element, does not result in a gel mobility shift when incubated with 10 mg/ml gentamicin. In addition, if a concentration as low as 10 pg/ml gentamicin produces a gel mobility shift, then it should be possible to detect changes to RNA structural motifs when small amounts of compounds from a library of diverse compounds are screened in this fashion.

Further analysis of the gentamicin-RNA interaction indicates that the interaction is magnesium- and temperature-dependent. As shown in FIG. 5, when MgCl₂ is not present (TK buffer), 1 mg/ml of gentamicin must be added to the reaction to produce a gel shift. Similarly, the temperature of the reaction when gentamicin is added is also important. When gentamicin is present in the reaction during the entire denaturation/renaturation cycle, that is, when gentamicin is added at 90° C. or 85° C., a gel shift is visualized (data not shown). In contrast, when gentamicin is added after the renaturation step has proceeded to 75° C., a mobility shift is not produced. These results are consistent with the notion that gentamicin may recognize and interact with an RNA structure formed early in the renaturation process.

6.2 Example Identification of a Dye-Labeled Target RNA Bound to Small Molecular Weight Compounds by Capillary Electrophoresis

The results presented in this Example indicate that interactions between a peptide and its target RNA, such as the Tat peptide and TAR RNA, can be monitored by gel retardation assays in an automated capillary electrophoresis system.

Materials and Methods

Buffers

Tris-potassium chloride (TK) buffer is composed of 50 mM Tris-HCl pH 7.4, 20 mM KCl, 0.1% Triton X-100, and 0.5 mM MgCl₂. Tris-borate-EDTA (TBE) buffer is composed of 45 mM Tris-borate pH 8.0, and 1 mM EDTA. Tris-potassium chloride-magnesium (TKM) buffer is composed of 50 mM Tris-HCl pH 7.4, 20 mM KCl, 0.1% Triton X-100 and 5 mM MgCl₂.
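By way of illustration only, the volumes of concentrated stocks needed to prepare the TK and TKM buffers can be calculated by simple dilution arithmetic. In the Python sketch below, the final concentrations are those recited above; the stock concentrations (1 M salts, 10% Triton X-100) and all names are assumptions chosen for the example and are not specified in this Example.

```python
# Final concentrations in mM, as recited for the TK and TKM buffers.
TK_MM = {"Tris-HCl pH 7.4": 50.0, "KCl": 20.0, "MgCl2": 0.5}
TKM_MM = {"Tris-HCl pH 7.4": 50.0, "KCl": 20.0, "MgCl2": 5.0}
TRITON_FINAL_PERCENT = 0.1

# Hypothetical stock solutions (assumed for illustration, not from the source).
STOCK_MM = {"Tris-HCl pH 7.4": 1000.0, "KCl": 1000.0, "MgCl2": 1000.0}
TRITON_STOCK_PERCENT = 10.0


def stock_volumes_ml(final_mm: dict, final_volume_ml: float) -> dict:
    """Volume of each stock (ml) needed for final_volume_ml of buffer."""
    volumes = {
        name: conc / STOCK_MM[name] * final_volume_ml
        for name, conc in final_mm.items()
    }
    volumes["Triton X-100"] = (
        TRITON_FINAL_PERCENT / TRITON_STOCK_PERCENT * final_volume_ml
    )
    # Water makes up the remainder of the final volume.
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


for label, recipe in (("TK", TK_MM), ("TKM", TKM_MM)):
    print(label)
    for name, ml in stock_volumes_ml(recipe, final_volume_ml=10.0).items():
        print(f"  {name}: {ml:.3f} ml")
```

The two recipes differ only in the MgCl₂ entry, so keeping the compositions as data makes the tenfold difference in magnesium between TK and TKM buffer explicit.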

Gel Retardation Analysis Using Capillary Electrophoresis

RNA oligonucleotides were purchased from Dharmacon, Inc. (Lafayette, Colo.). 500 pmole of a 5′ fluorescein-labeled oligonucleotide corresponding to the HIV-1 TAR element (TAR RNA; 5′-GGCAGAUCUGAGCCUGGGAGCUCUCUGCC-3′ (SEQ ID NO: 2); Huq et al., 1999, Nucleic Acids Research 27:1084-1093; Hwang et al., 1999, Proc. Natl. Acad. Sci. USA 96:12997-13002) was used. For Tat-TAR gel retardation reactions, the method of Huq et al. (Nucleic Acids Research, 1999, 27:1084-1093) was utilized with TK buffer containing 0.5 mM MgCl₂ and a 12-mer Tat peptide (YGRKKRRQRRRP (SEQ ID NO: 3); single letter amino acid code). In 20 μl reaction volumes, 50 pmoles of labeled oligonucleotide and the short Tat peptide (Tat₄₇₋₅₈) in TK or TKM buffer were heated at 90° C. for 2 minutes and allowed to cool to room temperature (approximately 24° C.) over 2 hours. The reactions were loaded onto an SCE9610 automated capillary electrophoresis apparatus (SpectruMedix; State College, Pennsylvania).

Results

As presented in Example 6.1, interactions between a peptide and RNA can be monitored by gel retardation assays. It was hypothesized that such interactions could also be monitored by gel retardation assays in an automated capillary electrophoresis system. To test this hypothesis, a gel retardation assay was performed in an automated capillary electrophoresis system using the Tat₄₇₋₅₈ peptide and the TAR RNA oligonucleotide. As shown in FIG. 6, using the capillary electrophoresis system, a clear shift is visible upon the addition of increasing concentrations of Tat peptide. In the reaction that lacks peptide, only a peak corresponding to the free RNA is observed. These observations confirm previous reports made using other Tat peptides (Hamy et al., 1997, Proc. Natl. Acad. Sci. USA 94:3548-3553; Huq et al., 1999, Nucleic Acids Res. 27:1084-1093).
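One way such capillary electrophoresis traces might be reduced to a single number per reaction is to integrate the free-RNA peak and the shifted peak and report the fraction of RNA in the shifted peak. The Python sketch below is illustrative only: the trace values, the time windows assigned to each peak, and the function names are hypothetical and are not taken from FIG. 6 or from the instrument software.

```python
from typing import Sequence


def peak_area(times: Sequence[float], signal: Sequence[float],
              start: float, end: float) -> float:
    """Trapezoidal area under the trace for intervals lying within [start, end]."""
    if len(times) != len(signal):
        raise ValueError("times and signal must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        if t0 >= start and t1 <= end:
            area += 0.5 * (signal[i - 1] + signal[i]) * (t1 - t0)
    return area


def fraction_bound(free_area: float, bound_area: float) -> float:
    """Fraction of labeled RNA found in the shifted (bound) peak."""
    total = free_area + bound_area
    if total <= 0:
        raise ValueError("no signal in either peak")
    return bound_area / total


# Hypothetical trace: one peak in the 0-2 window, a second peak in the 2-4 window.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.0, 2.0, 0.0, 4.0, 0.0]

free = peak_area(times, signal, start=0.0, end=2.0)   # window assumed for free RNA
bound = peak_area(times, signal, start=2.0, end=4.0)  # window assumed for the complex
print(f"fraction bound: {fraction_bound(free, bound):.3f}")
```

Repeating the calculation across increasing peptide concentrations would give a titration curve; which peak migrates first depends on the separation conditions and must be established with the peptide-free control.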

6.3 Example Compounds that Modulate Translation Termination Bind Specific Regions of 28S rRNA

Data are presented in this Example demonstrating that specific regions of the 28S rRNA are involved in modulating translation termination in mammalian cells. Compounds that interact in these regions or modulate local changes within these regions of the ribosome (e.g., alter base pairing interactions, base modification or modulate binding of trans-acting factors that bind to these regions) have the potential to modulate translation termination. These regions are conserved from prokaryotes to eukaryotes, but the role of these regions in modulating translation termination has not been realized in eukaryotes. In bacteria, when a short RNA fragment, complementary to the E. coli 23S rRNA segment comprising nucleotides 735 to 766 (in domain II), is expressed in vivo, suppression of UGA nonsense mutations, but not UAA or UAG, results (Chernyaeva et al., 1999, J. Bacteriol. 181:5257-5262). Other regions of the 23S rRNA in E. coli have been implicated in nonsense suppression, including the GTPase center in domain II (nt 1034-1120; Jemiolo et al., 1995, Proc. Natl. Acad. Sci. USA 92:12309-12313).

Materials and Methods

Small Molecules Involved in Modulating Translation Termination

Small molecules involved in modulating translation termination, i.e., nonsense suppression, were used in the footprinting experiments presented in FIGS. 2 to 6 and are listed as Compound A (molecular formula C₁₉H₂₁NO₄), Compound B (molecular formula C₁₉H₂₁N₂O₅), Compound C (molecular formula C₁₂H₁₅N₅O), Compound D (molecular formula C₂₃H₁₅O₃Br), Compound E (molecular formula C₁₉H₂₁NO₄), Compound F, Compound G (molecular formula C₁₂H₁₅N₅O), Compound H (molecular formula C₂₃H₁₅NO₅), Compound I (molecular formula C₂₃H₁₅NO₅), Compound J, and Compound K.

Preparation of a Translation Extract from HeLa Cells

HeLa S3 cells were grown to a density of 10⁶ cells/ml in DMEM (5% CO₂, 10% FBS, 1× P/S) in a spinner flask. Cells were harvested by spinning at 1000×g. Cells were washed twice with phosphate buffered saline. The cell pellet was kept on ice for 12-24 hours before proceeding. Letting the cells sit on ice increases the activity of the extract up to twenty-fold. The length of time on ice can range from 0 hours to 1 week. The cells were resuspended in 1.5 volumes (packed cell volume) of hypotonic buffer (10 mM HEPES (KOH) pH 7.4; 15 mM KCl; 1.5 mM Mg(OAc)₂; 0.5 mM Pefabloc (Roche); 2 mM DTT). The cells were allowed to swell for 5 minutes on ice, dounce homogenized with 10 to 100 strokes using a tight-fitting pestle, and spun for 10 minutes at 12000×g at 4° C. in a Sorvall SS-34 rotor. The supernatant was collected with a Pasteur pipet without disturbing the lipid layer, transferred into Eppendorf tubes (50 to 200 μl aliquots), and immediately frozen in liquid nitrogen.

Footprinting

Ribosomes prepared from HeLa cells were incubated with the small molecules (at a concentration of 100 μM), followed by treatment with chemical modifying agents (dimethyl sulfate [DMS] and kethoxal [KE]). Following chemical modification, rRNA was phenol-chloroform extracted, ethanol precipitated, analyzed in primer extension reactions using end-labeled oligonucleotides hybridizing to different regions of the rRNAs, and resolved on 6% polyacrylamide gels. The probes used for primer extension cover the entire 18S (7 oligonucleotide primers), 28S (24 oligonucleotide primers), and 5S (one primer) rRNAs and are presented in Table 1 (also see, e.g., Gonzalez et al., 1985, Proc. Natl. Acad. Sci. USA 82(22):7666-70 and McCallum & Maden, 1985, Biochem. J. 232(3):725-733). Controls in these experiments include DMSO (a control for changes in rRNA accessibility induced by DMSO), paromomycin (a marker for 18S rRNA binding), and anisomycin (a marker for 28S rRNA binding).

TABLE 1
18S, 28S, and 5S rRNA primers

Primer    Sequence                        SEQ ID NO.
5S#1      AAAGCCTACAGCACCC                4
28S#1     TACTGAGGGAATCCTGG               5
28S#2     TTACCACCCGCTTTGGG               6
28S#3     GGGGGCGGGAAAGATCC               7
28S#4     CCCCGAGCCACCTTCCC               8
28S#5     GGCCCCGGGATTCGGCG               9
28S#6     CACTGGGGACAGTCCGC               10
28S#7     CGCGGCGGGCGAGACGGG              11
28S#8     GAGGGAAACTTCGGAGGG              12
28S#9     CATCGGGCGCCTTAACCC              13
28S#10    CGACGCACACCACACGC               14
28S#11    CCAAGATCTGCACCTGC               15
28S#12    TTACCGCACTGGACGCG               16
28S#13    GCCAGAGGCTGTTCACC               17
28S#14    TGGGGAGGGAGCGAGCGGCG            18
28S#15    AAGGGCCCGGCTCGCGTCC             19
28S#16    AGGGCGGGGGGACGAACCGC            20
28S#17    TTAAACAGTCGGATTCCCCTGG          21
28S#18    TTCATCCATTCATGCGCG              22
28S#19    AGTAGTGGTATTTCACCGG             23
28S#20    ACGGGAGGTTTCTGTCC               24
28S#21    ACAATGATAGGAAGAGCCG             25
28S#22    AGGCGTTCAGTCATAATCCC            26
28S#23    TCCGCACCGGACCCCGGTCC            27
28S#24    GGGCTAGTTGATTCGGCAGGTGAGTTG     28
18S#1     TCTCCGGAATCGAACCCT              29
18S#2     ATTACCGCGGCTGCTGGC              30
18S#3     TTGGCAAATGCTTTCGC               31
18S#4     CCGTCAATTCCTTTAAGTTTC           32
18S#5     AGGGCATCACAGACCTGTTAT           33
18S#6     CGACGGGCGGTGTGTAC               34
18S#7     CCGCAGGTTCACCTACGG              35
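Because each primer extension probe hybridizes to the rRNA, the rRNA segment recognized by a given primer is the reverse complement of the primer sequence, written as RNA. The Python sketch below illustrates this for two of the primers in Table 1; the function and variable names are illustrative, and locating each site within a full-length rRNA sequence would require a reference sequence that is not reproduced here.

```python
# DNA primer base -> complementary RNA base on the rRNA template.
_DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def rrna_target_site(primer_dna: str) -> str:
    """Return the rRNA sequence (5' to 3') to which a DNA primer hybridizes."""
    primer = primer_dna.replace(" ", "").upper()
    unexpected = set(primer) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in primer: {sorted(unexpected)}")
    # Reverse the primer, then complement each base into RNA.
    return primer[::-1].translate(_DNA_TO_RNA_COMPLEMENT)


# Two primers copied from Table 1.
primers = {
    "5S#1": "AAAGCCTACAGCACCC",
    "18S#7": "CCGCAGGTTCACCTACGG",
}

for name, seq in primers.items():
    print(f"{name}: primer 5'-{seq}-3' pairs with rRNA 5'-{rrna_target_site(seq)}-3'")
```

Primer extension proceeds from the 3′ end of the primer toward the 5′ end of the rRNA, so modified nucleotides detected with a given primer lie 5′ of the site computed here.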

Results

The results of these footprinting experiments (see, e.g., FIGS. 7 to 11) indicated that the small molecules involved in modulating translation termination alter the accessibility of the chemical modifying agents to specific nucleotides in the 28S rRNA. More specifically, the regions protected by the small molecules include a conserved region in the vicinity of the peptidyl transferase center (domain V; see, e.g., FIGS. 7 and 8) implicated in peptide bond formation and a conserved region in domain II (see, e.g., FIGS. 9, 10, and 11) that may interact with the peptidyl transferase center based on binding of vernamycin B to both these areas (Vannuffel et al., 1994, Nucleic Acids Res. 22(21):4449-53).

6.4 Example High Throughput Identification of Compounds Using Arrays

To identify molecules of the invention, high throughput assays that enable each compound to be screened against many different nucleic acids in a parallel manner are used. In brief, synthesis beads, with compounds of the invention attached, are distributed into microtiter plates at a density of one bead per well. Compounds of the invention that are attached to the beads are then released from the beads and dissolved in a small amount of solvent in each microtiter well. A high precision technique, such as a robotic arrayer, is then used to transfer small volumes of solution containing dissolved compounds of the invention from each microtiter well, delivering the compounds to defined locations on glass slides. The glass slides are derivatized so that the compounds of the invention are immobilized on the surface of the slide. Each compound contains a functional group that allows for its immobilization on the glass slide. Each slide is then probed with a labeled RNA and binding events are detected by, e.g., a fluorescence-linked assay that is able to detect the label.
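Tracing a binding event on the slide back to a compound requires a record of which microtiter well was delivered to which slide location. The Python sketch below shows one possible bookkeeping scheme; the 96-well plate format, the spot pitch, the number of spots per row, and all names are assumptions made for illustration and are not specified in this Example.

```python
from string import ascii_uppercase


def well_names(rows: int = 8, cols: int = 12) -> list:
    """Well labels in row-major order, e.g. A1 ... A12, B1 ... H12 for a 96-well plate."""
    return [f"{ascii_uppercase[r]}{c + 1}" for r in range(rows) for c in range(cols)]


def spot_layout(wells: list, spots_per_row: int, pitch_um: float) -> dict:
    """Map each well to the (x, y) position in micrometers of its spot on the slide."""
    layout = {}
    for index, well in enumerate(wells):
        row, col = divmod(index, spots_per_row)
        layout[well] = (col * pitch_um, row * pitch_um)
    return layout


def well_at(layout: dict, x_um: float, y_um: float) -> str:
    """Return the well whose spot is nearest to a detected signal position."""
    return min(
        layout,
        key=lambda w: (layout[w][0] - x_um) ** 2 + (layout[w][1] - y_um) ** 2,
    )


wells = well_names()
layout = spot_layout(wells, spots_per_row=12, pitch_um=500.0)  # hypothetical pitch

# A fluorescent signal detected near (5480, 3510) um maps back to one source well.
print(well_at(layout, 5480.0, 3510.0))
```

With one bead per well, the well label recovered in this way identifies the single compound responsible for the signal.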

6.5 Example Human Disease Genes Sorted by Chromosome

TABLE 2 Genes, Locations and Genetic Disorders on Chromosome 1 Gene GDBAccession ID OMIM Link ABCA4 GDB: 370748 MACULAR DEGENERATION, SENILESTARGARDT DISEASE 1; STGD1 ATP BINDING CASSETTE TRANSPORTER; ABCRRETINITIS PIGMENTOSA-19; RP19 ABCD3 GDB: 131485 PEROXISOMAL MEMBRANEPROTEIN 1; PXMP1 ACADM GDB: 118958 ACYL-CoA DEHYDROGENASE, MEDIUM-CHAIN;ACADM AGL GDB: 132644 GLYCOGEN STORAGE DISEASE III AGT GDB: 118750ANGIOTENSIN I; AGT ALDH4A1 GDB: 9958827 HYPERPROLINEMIA, TYPE II ALPLGDB: 118730 PHOSPHATASE, LIVER ALKALINE; ALPL HYPOPHOSPHATASIA,INFANTILE AMPD1 GDB: 119677 ADENOSINE MONOPHOSPHATE DEAMINASE-1; AMPD1APOA2 GDB: 119685 APOLIPOPROTEIN A-II; APOA2 AVSD1 GDB: 265302ATRIOVENTRICULAR SEPTAL DEFECT; AVSD BRCD2 GDB: 9955322 BREAST CANCER,DUCTAL, 2; BRCD2 C1QA GDB: 119042 COMPLEMENT COMPONENT 1, qSUBCOMPONENT, ALPHA POLYPEPTIDE; C1QA C1QB GDB: 119043 COMPLEMENTCOMPONENT 1, q SUBCOMPONENT, BETA POLYPEPTIDE; C1QB C1QG GDB: 128132COMPLEMENT COMPONENT 1, q SUBCOMPONENT, GAMMA POLYPEPTIDE; C1QG C8A GDB:119735 COMPLEMENT COMPONENT-8, DEFICIENCY OF C8B GDB: 119736 COMPLEMENTCOMPONENT-8, DEFICIENCY OF, TYPE II CACNA1S GDB: 126431 CALCIUM CHANNEL,VOLTAGE-DEPENDENT, L TYPE, ALPHA 1S SUBUNIT; CAGNA1S PERIODIC PARALYSISI MALIGNANT HYPERTHERMIA SUSCEPTIBILITY-5; MHS5 CCV GDB: 1336655CATARACT, CONGENITAL, VOLKMANN TYPE; CCV CD3Z GDB: 119766 CD3Z ANTIGEN,ZETA POLYPEPTIDE; CD3Z CDC2L1 GDB: 127827 PROTEIN KINASE p58; PK58 CHMLGDB: 135222 CHOROIDEREMIA-LIKE; CHML CHS1 GDB: 4568202 CHEDIAK-HIGASHISYNDROME; CHS1 CIAS1 GDB: 9957338 COLD HYPERSENSITIVITY URTICARIA,DEAFNESS, AND AMYLOIDOSIS CLCNKB GDB: 698472 CHLORIDE CHANNEL, KIDNEY,B; CLCNKB CMD1A GDB: 434478 CARDIOMYOPATHY, DILATED 1A; CMD1A CMH2 GDB:137324 CARDIOMYOPATHY, FAMILIAL HYPERTROPHIC, 2; CMH2 CMM GDB: 119059MELANOMA, MALIGNANT COL11A1 GDB: 120595 COLLAGEN, TYPE XI, ALPHA-1;COL11A1 COL9A2 GDB: 138310 COLLAGEN, TYPE IX, ALPHA-2 CHAIN; COL9A2EPIPHYSEAL DYSPLASIA, MULTIPLE, 2; EDM2 CPT2 GDB: 127272 MYOPATHY WITHDEFICIENCY OF 
CARNITINE PALMITOYLTRANSFERASE II HYPOGLYCEMIA,HYPOKETOTIC, WITH DEFICIENCY OF CARNITINE PALMITOYLTRANSFERASE CARNITINEPALMITOYLTRANSFERASE II; CPT2 CRB1 GDB: 333930 RETINITIS PIGMENTOSA-12;RP12 CSE GDB: 596182 CHOREOATHETOSIS/SPASTICITY, EPISODIC; CSE CSF3RGDB: 126430 COLONY STIMULATING FACTOR 3 RECEPTOR, GRANULOCYTE; CSF3RCTPA GDB: 9863168 CATARACT, POSTERIOR POLAR CTSK GDB: 453910PYCNODYSOSTOSIS CATHEPSIN K; CTSK DBT GDB: 118784 MAPLE SYRUP URINEDISEASE, TYPE 2 DIO1 GDB: 136449 THYROXINE DEIODINASE TYPE I; TXDI1DISC1 GDB: 9992707 DISORDER-2; SCZD2 DPYD GDB: 364102 DIHYDROPYRIMIDINEDEHYDROGENASE; DPYD EKV GDB: 119106 ERYTHROKERATODERMIA VARIABILIS; EKVENO1 GDB: 119871 PHOSPHOPYRUVATE HYDRATASE; PPH ENO1P GDB: 135006PHOSPHOPYRUVATE HYDRATASE; PPH EPB41 GDB: 119865 ERYTHROCYTE MEMBRANEPROTEIN BAND 4.1; EPB41 HEREDITARY HEMOLYTIC EPHX1 GDB: 119876 EPOXIDEHYDROLASE 1, MICROSOMAL; EPHX1 F13B GDB: 119893 FACTOR XIII, B SUBUNIT;F13B F5 GDB: 119896 FACTOR V DEFICIENCY FCGR2A GDB: 119903 Fc FRAGMENTOF IgG, LOW AFFINITY IIa, RECEPTOR FOR; FCGR2A FCGR2B GDB: 128183 FcFRAGMENT OF IgG, LOW AFFINITY IIa, RECEPTOR FOR; FCGR2A FCGR3A GDB:119904 Fc FRAGMENT OF IgG, LOW AFFINITY IIIa, RECEPTOR FOR; FCGR3A FCHLGDB: 9837503 HYPERLIPIDEMIA, COMBINED FH GDB: 119133 FUMARATE HYDRATASE;FH LEIOMYOMATA, HEREDITARY MULTIPLE, OF SKIN FMO3 GDB: 135136FLAVIN-CONTAINING MONOOXYGENASE 3; FMO3 TRIMETHYLAMINURIA FMO4 GDB:127981 FLAVIN-CONTAINING MONOOXYGENASE 2; FMO2 FUCA1 GDB: 119237FUCOSIDOSIS FY GDB: 119242 BLOOD GROUP - DUFFY SYSTEM; Fy GALE GDB:119245 GALACTOSE EPIMERASE DEFICIENCY GBA GDB: 119262 GAUCHER DISEASE,TYPE I; GD I GFND GDB: 9958222 GLOMERULAR NEPHRITIS, FAMILIAL, WITHFIBRONECTIN DEPOSITS GJA8 GDB: 696369 CATARACT, ZONULAR PULVERULENT 1;CZP1 GAP JUNCTION PROTEIN, ALPHA-8, 50-KD; GJA8 GJB3 GDB: 127820ERYTHROKERATODERMIA VARIABILIS; EKV DEAFNESS, AUTOSOMAL DOMINANTNONSYNDROMIC SENSORINEURAL, 2; DFNA2 GLC3B GDB: 3801939 GLAUCOMA 3,PRIMARY INFANTILE, B; GLC3B HF1 GDB: 120041 H FACTOR 
1; HF1 HMGCL GDB:138445 HYDROXYMETHYLGLUTARICACIDURIA; HMGCL HPC1 GDB: 5215209 PROSTATECANCER; PRCA1 PROSTATE CANCER, HEREDITARY 1 HRD GDB: 9862254HYPOPARATHYROIDISM WITH SHORT STATURE, MENTAL RETARDATION, AND SEIZURESHRPT2 GDB: 125253 HYPERPARATHYROIDISM, FAMILIAL PRIMARY, WITH MULTIPLEOSSIFYING JAW HSD3B2 GDB: 134044 ADRENAL HYPERPLASIA II HSPG2 GDB:126372 HEPARAN SULFATE PROTEOGLYCAN OF BASEMENT MEMBRANE; HSPG2 MYOTONICMYOPATHY, DWARFISM, CHONDRODYSTROPHY, AND OCULAR AND FACIAL KCNQ4 GDB:439046 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 2; DFNA2KCS GDB: 9848740 KENNY-CAFFEY SYNDROME, RECESSIVE FORM KIF1B GDB: 128645CHARCOT-MARIE-TOOTH DISEASE, NEURONAL TYPE, A; CMT2A LAMB3 GDB: 251820LAMININ, BETA 3; LAMB3 LAMC2 GDB: 136225 LAMININ, GAMMA 2; LAMC2EPIDERMOLYSIS BULLOSA LETALIS LGMD1B GDB: 231606 MUSCULAR DYSTROPHY,LIMB-GIRDLE, TYPE 1B; LGMD1B LMNA GDB: 132146 LAMIN A/C; LMNALIPODYSTROPHY, FAMILIAL PARTIAL, DUNNIGAN TYPE; LDP1 LOR GDB: 132049LORICRIN; LOR MCKD1 GDB: 9859381 POLYCYSTIC KIDNEYS, MEDULLARY TYPE MCL1GDB: 139137 MYELOID CELL LEUKEMIA 1; MCL1 MPZ GDB: 125266 HYPERTROPHICNEUROPATHY OF DEJERINE-SOTTAS MYELIN PROTEIN ZERO; MPZ MTHFR GDB: 3708825,10-@METHYLENETETRAHYDROFOLATE REDUCTASE; MTHFR MTR GDB: 119440METHYLTETRAHYDROFOLATE:L-HOMOCYSTEINE S-METHYLTRANSFERASE; MTR MUTYHGDB: 9315115 ADENOMATOUS POLYPOSIS OF THE COLON; APC MYOC GDB: 5584221GLAUCOMA 1, OPEN ANGLE; GLC1A MYOCILIN; MYOC NB GDB: 9958705NEUROBLASTOMA; NB NCF2 GDB: 120223 GRANULOMATOUS DISEASE, CHRONIC,AUTOSOMAL CYTOCHROME-b-POSITIVE FORM NEM1 GDB: 127387 NEMALINE MYOPATHY1, AUTOSOMAL DOMINANT; NEM1 NPHS2 GDB: 9955617 ARRHYTHMOGENIC RIGHTVENTRICULAR DYSPLASIA, FAMILIAL, 2; ARVD2 NPPA GDB: 118727 NATRIURETICPEPTIDE PRECURSOR A; NPPA NRAS GDB: 119457 ONCOGENE NRAS; NRAS; NRAS1NTRK1 GDB: 127897 ONCOGENE TRK NEUROTROPHIC TYROSINE KINASE, RECEPTOR,TYPE 1; NTRK1 NEUROPATHY, CONGENITAL SENSORY, WITH ANHIDROSIS OPTA2 GDB:9955577 OSTEOPETROSIS, AUTOSOMAL DOMINANT, TYPE II; OPA2 PBX1 
GDB:125351 PRE-B-CELL LEUKEMIA TRANSCRIPTION FACTOR-1; PBX1 PCHC GDB:9955586 PHEOCHROMOCYTOMA PGD GDB: 119486 6-@PHOSPHOGLUCONATEDEHYDROGENASE, ERYTHROCYTE PHA2A GDB: 9955628 PSEUDOHYPOALDOSTERONISM,TYPE II; PHA2 PHGDH GDB: 9958261 3-@PHOSPHOGLYCERATE DEHYDROGENASEDEFICIENCY PKLR GDB: 120294 PYRUVATE KINASE DEFICIENCY OF ERYTHROCYTEPKP1 GDB: 4249598 PLAKOPHILIN 1; PKP1 PLA2G2A GDB: 120296 PHOSPHOLIPASEA2, GROUP IIA; PLA2G2A PLOD GDB: 127821 PROCOLLAGEN-LYSINE,2-OXOGLUTARATE 5-DIOXYGENASE; PLOD EHLERS-DANLOS SYNDROME, TYPE VI; E-DVI; EDS VI PPOX GDB: 118852 PROTOPORPHYRINOGEN OXIDASE; PPOX PPT GDB:125227 CEROID-LIPOFUSCINOSIS, NEURONAL 1, INFANTILE; CLN1PALMITOYL-PROTEIN THIOESTERASE; PPT PRCC GDB: 3888215 PAPILLARY RENALCELL CARCINOMA; PRCC PRG4 GDB: 9955719 ARTHROPATHY-CAMPTODACTYLYSYNDROME PSEN2 GDB: 633044 ALZHEIMER DISEASE, FAMILIAL, TYPE 4; AD4PTOS1 GDB: 6279920 PTOSIS, HEREDITARY CONGENITAL 1; PTOS1 REN GDB:120345 RENIN; REN RFX5 GDB: 6288464 REGULATORY FACTOR 5; RFX5 RHD GDB:119551 RHESUS BLOOD GROUP, D ANTIGEN; RHD RMD1 GDB: 448902 RIPPLINGMUSCLE DISEASE-1; RMD1 RPE65 GDB: 226519 RETINAL PIGMENTEPITHELIUM-SPECIFIC PROTEIN, 65-KD; RPE65 AMAUROSIS CONGENITA OF LEBERII SCCD GDB: 9955558 CORNEAL DYSTROPHY, CRYSTALLINE, OF SCHNYDERSERPINC1 GDB: 119024 ANTITHROMBIN III DEFICIENCY SJS1 GDB: 1381631MYOTONIC MYOPATHY, DWARFISM, CHONDRODYSTROPHY, AND OCULAR AND FACIALSLC19A2 GDB: 9837779 THIAMINE-RESPONSIVE MEGALOBLASTIC ANEMIA SYNDROMESLC2A1 GDB: 120627 SOLUTE CARRIER FAMILY 2, MEMBER 1; SLC2A1 SPTA1 GDB:119601 ELLIPTOCYTOSIS, RHESUS-UNLINKED TYPE HEREDITARY HEMOLYTICSPECTRIN, ALPHA, ERYTHROCYTIC 1; SPTA1 TAL1 GDB: 120759 T-CELL ACUTELYMPHOCYTIC LEUKEMIA 1; TAL1 TNFSF6 GDB: 422178 APOPTOSIS ANTIGEN LIGAND1; APT1LG1 TNNT2 GDB: 221879 TROPONIN-T2, CARDIAC; TNNT2 TPM3 GDB:127872 ONCOGENE TRK TROPOMYOSIN 3; TPM3 TSHB GDB: 120467THYROID-STIMULATING HORMONE, BETA CHAIN; TSHB UMPK GDB: 120481 URIDINEMONOPHOSPHATE KINASE; UMPK UOX GDB: 127539 URATE OXIDASE; UOX UROD 
GDB:119628 PORPHYRIA CUTANEA TARDA; PCT USH2A GDB: 120483 USHER SYNDROME,TYPE II; USH2 VMGLOM GDB: 9958134 GLOMUS TUMORS, MULTIPLE VWS GDB:120532 CLEFT LIP AND/OR PALATE WITH MUCOUS CYSTS OF LOWER LIP WS2B GDB:407579 WAARDENBURG SYNDROME, TYPE 2B; WS2B

TABLE 3 Genes, Locations and Genetic Disorders on Chromosome 2 Gene GDBAccession ID Location OMIM Link ABCB11 GDB: 9864786 2q24-2q24CHOLESTASIS, PROGRESSIVE 2q24.3-2q24.3 FAMILIAL INTRAHEPATIC 2; PFIC2ABCG5 GDB: 10450298 2p21-2p21 PHYTOSTEROLEMIA ABCG8 GDB: 104503002p21-2p21 PHYTOSTEROLEMIA ACADL GDB: 118745 2q34-2q35 ACYL-CoADEHYDROGENASE, LONG-CHAIN, DEFICIENCY OF ACP1 GDB: 118962 2p25-2p25PHOSPHATASE, ACID, OF ERYTHROCYTE; ACP1 AGXT GDB: 127113 2q37.3-2q37.3OXALOSIS I AHHR GDB: 118984 2pter-2q31 CYTOCHROME P450, SUBFAMILY I,POLYPEPTIDE 1; CYP1A1 ALMS1 GDB: 9865539 2p13-2p12 ALSTROM SYNDROME2p14-2p13 2p13.1-2p13.1 ALPP GDB: 119672 2q37.1-2q37.1 ALKALINEPHOSPHATASE, PLACENTAL; ALPP ALS2 GDB: 135696 2q33-2q35 AMYOTROPHICLATERAL SCLEROSIS 2, JUVENILE; ALS2 APOB GDB: 119686 2p24-2p23APOLIPOPROTEIN B; APOB 2p24-2p24 BDE GDB: 9955730 2q37-2q37BRACHYDACTYLY, TYPE E; BDE BDMR GDB: 533064 2q37-2q37BRACHYDACTYLY-MENTAL RETARDATION SYNDROME; BDMR BJS GDB: 99557172q34-2q36 TORTI AND NERVE DEAFNESS BMPR2 GDB: 642243 2q33-2q33 PULMONARY2q33-2q34 HYPERTENSION, PRIMARY; PPH1 BONE MORPHOGENETIC RECEPTOR TYPEII; BMPR2 CHRNA1 GDB: 120586 2q24-2q32 CHOLINERGIC RECEPTOR, NICOTINIC,ALPHA POLYPEPTIDE 1; CHRNA1 CMCWTD GDB: 11498919 2p22.3-2p21 FAMILIALCHRONIC MUCOCUTANEOUS, DOMINANT TYPE CNGA3 GDB: 434398 2q11.2-2q11.2COLORBLINDNESS, TOTAL CYCLIC NUCLEOTIDE GATED CHANNEL, OLFACTORY, 3;CNG3 COL3A1 GDB: 118729 2q31-2q32.3 COLLAGEN, TYPE III; COL3A12q32.2-2q32.2 EHLERS-DANLOS SYNDROME, TYPE IV, AUTOSOMAL DOMINANT COL4A3GDB: 128351 2q36-2q37 COLLAGEN, TYPE IV, ALPHA-3 CHAIN; COL4A3 COL4A4GDB: 132673 2q35-2q37 COLLAGEN, TYPE IV, ALPHA-4 CHAIN; COL4A4 COL6A3GDB: 119066 2q37.3-2q37.3 COLLAGEN, TYPE VI, ALPHA-3 CHAIN; COL6A3MYOPATHY, BENIGN CONGENITAL, WITH CONTRACTURES CPS1 GDB: 1197992q33-2q36 HYPERAMMONEMIA DUE TO 2q34-2q35 CARBAMOYLPHOSPHATE 2q35-2q35SYNTHETASE I DEFICIENCY CRYGA GDB: 119076 2q33-2q35 CRYSTALLIN, GAMMA A;CRYGA CRYGEP1 GDB: 119808 2q33-2q35 CRYSTALLIN, GAMMA A; CRYGA 
CYP1B1GDB: 353515 2p21-2p21 GLAUCOMA 3, PRIMARY 2p22-2p21 INFANTILE, A; GLC3A2pter-2qter CYTOCHROME P450, SUBFAMILY I (DIOXIN-INDUCIBLE), POLYPEPTIDE1; CYP1B1 CYP27A1 GDB: 128129 2q33-2qter CEREBROTENDINOUS XANTHOMATOSISDBI GDB: 119837 2q12-2q21 DIAZEPAM BINDING INHIBITOR; DBI DES GDB:119841 2q35-2q35 DESMIN; DES DYSF GDB: 340831 2p-2p MUSCULAR DYSTROPHY,2p13-2p13 LIMB-GIRDLE, TYPE 2B; 2pter-2p12 LGMD2B MUSCULAR DYSTROPHY,LATE-ONSET DISTAL EDAR GDB: 9837372 2q11-2q13 DYSPLASIA, HYPOHIDROTICECTODERMAL DYSPLASIA, ANHIDROTIC EFEMP1 GDB: 1220111 2p16-2p16 DOYNEHONEYCOMB DEGENERATION OF RETINA FIBRILLIN-LIKE; FBNL EIF2AK3 GDB:9956743 2p12-2p12 EPIPHYSEAL DYSPLASIA, MULTIPLE, WITH EARLY-ONSETDIABETES MELLITUS ERCC3 GDB: 119881 2q21-2q21 EXCISION-REPAIR,COMPLEMENTING DEFECTIVE, IN CHINESE HAMSTER; 3; ERCC3 FSHR GDB: 1275102p21-2p16 FOLLICLE-STIMULATING HORMONE RECEPTOR; FSHR GONADALDYSGENESIS, XX TYPE GAD1 GDB: 119244 2q31-2q31 PYRIDOXINE DEPENDENCYWITH SEIZURES GINGF GDB: 9848875 2p21-2p21 GINGIVAL SON OF SEVENLESS(DROSOPHILA) HOMOLOG 1; SOS1 GLC1B GDB: 1297553 2q1-2q13 GLAUCOMA 1,OPEN ANGLE, B; GLC1B GPD2 GDB: 354558 2q24.1-2q24.1 GLYCEROL-3-PHOSPHATEDEHYDROGENASE-2; GPD2 GYPC GDB: 120027 2q14-2q21 BLOOD GROUP - GERBICH;Ge HADHA GDB: 434026 2p23-2p23 HYDROXYACYL-CoA DEHYDROGENASE/3-KETOACYL-CoA THIOLASE/ENOYL-CoA HYDRATASE, HADHB GDB: 344953 2p23-2p23HYDROXYACYL-CoA DEHYDROGENASE/3-KETOACYL- CoA THIOLASE/ENOYL-CoAHYDRATASE, HOXD13 GDB: 127225 2q31-2q31 HOMEO BOX-D13; HOXD13SYNDACTYLY, TYPE II HPE2 GDB: 136066 2p21-2p21 MIDLINE CLEFT SYNDROMEIGKC GDB: 120088 2p12-2p12 IMMUNOGLOBULIN KAPPA 2p11.2-2p11.2 CONSTANTREGION; IGKC IHH GDB: 511203 2q33-2q35 BRACHYDACTYLY, TYPE A1; 2q35-2q35BDA1 INDIAN HEDGEHOG, 2pter-2qter DROSOPHILA, HOMOLOG OF; IHH IRS1 GDB:133974 2q36-2q36 INSULIN RECEPTOR SUBSTRATE 1; IRS1 ITGA6 GDB: 1280272pter-2qter INTEGRIN, ALPHA-6; ITGA6 KHK GDB: 391903 2p23.3-2p23.2FRUCTOSURIA KYNU GDB: 9957925 2q22.2-2q23.3 LCT GDB: 120140 2q21-2q21DISACCHARIDE 
INTOLERANCE II LHCGR GDB: 125260 2p21-2p21 LUTEINIZINGHORMONE/CHORIOGONADOTROPIN RECEPTOR; LHCGR LSFC GDB: 9956219 2-22p16-2p16 CYTOCHROME c OXIDASE DEFICIENCY, FRENCH-CANADIAN TYPE MSH2GDB: 203983 2p16-2p16 COLON CANCER, FAMILIAL, 2p22-2p21 NONPOLYPOSISTYPE 1; FCC1 MSH6 GDB: 632803 2p16-2p16 G/T MISMATCH-BINDING PROTEIN;GTBP NEB GDB: 120224 2q24.1-2q24.2 NEBULIN; NEB NEMALINE MYOPATHY 2,AUTOSOMAL RECESSIVE; NEM2 NMTC GDB: 11498336 2q21-2q21 THYROIDCARCINOMA, PAPILLARY NPHP1 GDB: 128050 2q13-2q13 NEPHRONOPHTHISIS,FAMILIAL JUVENILE 1; NPHP1 PAFAH1P1 GDB: 435099 2p11.2-2p11.2PLATELET-ACTIVATING FACTOR ACETYLHYDROLASE, GAMMA SUBUNIT PAX3 GDB:120495 2q36-2q36 KLEIN-WAARDENBURG 2q35-2q35 SYNDROME WAARDENBURGSYNDROME; WS1 PAX8 GDB: 136447 2q12-2q14 PAIRED BOX HOMEOTIC GENE 8;PAX8 PMS1 GDB: 386403 2q31-2q33 POSTMEIOTIC SEGREGATION INCREASED (S.CEREVISIAE)- 1; PMS1 PNKD GDB: 5583973 2q33-2q35 CHOREOATHETOSIS,FAMILIAL PAROXYSMAL; FPD1 PPH1 GDB: 1381541 2q31-2q32 PULMONARY2q33-2q33 HYPERTENSION, PRIMARY; PPH1 PROC GDB: 120317 2q13-2q21 PROTEINC DEFICIENCY, 2q13-2q14 CONGENITAL THROMBOTIC DISEASE DUE TO REG1A GDB:132455 2p12-2p12 REGENERATING ISLET-DERIVED 1-ALPHA; REG1A SAG GDB:120365 2q37.1-2q37.1 S-ANTIGEN; SAG SFTPB GDB: 120374 2p12-2p11.2SURFACTANT-ASSOCIATED PROTEIN, PULMONARY-3; SFTP3 SLC11A1 GDB: 3714442q35-2q35 CIRRHOSIS, PRIMARY; PBC NATURAL RESISTANCE-ASSOCIATEDMACROPHAGE PROTEIN 1; NRAMP1 SLC3A1 GDB: 202968 2p16.3-2p16.3 SOLUTECARRIER FAMILY 3, 2p21-2p21 MEMBER 1; SLC3A1 CYSTINURIA; CSNU SOS1 GDB:230004 2p22-2p21 GINGIVAL SON OF SEVENLESS (DROSOPHILA) HOMOLOG 1; SOS1SPG4 GDB: 230127 2p24-2p21 SPASTIC PARAPLEGIA-4, AUTOSOMAL DOMINANT;SPG4 SRD5A2 GDB: 127343 2p23-2p23 PSEUDOVAGINAL PERINEOSCROTALHYPOSPADIAS; PPSH TCL4 GDB: 136378 2q34-2q34 T-CELL LEUKEMIA/LYMPHOMA-4;TCL4 TGFA GDB: 120435 2p13-2p13 TRANSFORMING GROWTH FACTOR, ALPHA; TGFATMD GDB: 9837196 2q31-2q31 TIBIAL MUSCULAR DYSTROPHY, TARDIVE TPO GDB:120446 2p25-2p25 THYROID 2p25-2p24 HORMONOGENESIS, GENETIC 
DEFECT IN,IIA UGT1 GDB: 120007 2q37-2q37 UDP GLUCURONOSYLTRANSFERASE 1 FAMILY, A1;UGT1A1 UV24 GDB: 9955737 2pter-2qter UV-DAMAGE, EXCISION REPAIR OF,UV-24 WSS GDB: 9955707 2q32-2q32 WRINKLY SKIN SYNDROME; WSS XDH GDB:266386 2p23-2p22 XANTHINURIA ZAP70 GDB: 433738 2q11-2q13 SYK-RELATEDTYROSINE 2q12-2q12 KINASE; SRK ZFHX1B GDB: 9958310 2q22-2q22 DISEASE,MICROCEPHALY, AND IRIS COLOBOMA

TABLE 4 Genes, Locations and Genetic Disorders on Chromosome 3 Gene GDBAccession ID Location OMIM Link ACAA1 GDB: 119643 3p23-3p22 PEROXISOMAL3-OXOACYL-COENZYME A THIOLASE DEFICIENCY AGTR1 GDB: 132359 3q21-3q25ANGIOTENSIN II RECEPTOR, VASCULAR TYPE 1; AT2R1 AHSG GDB: 1189853q27-3q27 ALPHA-2-HS-GLYCOPROTEIN; AHSG AMT GDB: 132138 3p21.3-3p21.2HYPERGLYCINEMIA, ISOLATED 3p21.2-3p21.1 NONKETOTIC, TYPE II; NKH2 ARPGDB: 9959049 3p21.1-3p21.1 ARGININE-RICH PROTEIN BBS3 GDB: 376501 3p-3pBARDET-BIEDL SYNDROME, 3p12.3-3q11.1 TYPE 3; BBS3 BCHE GDB: 1205583q26.1-3q26.2 BUTYRYLCHOLINESTERASE; BCHE BCPM GDB: 433809 3q21-3q21BENIGN CHRONIC PEMPHIGUS; BCPM BTD GDB: 309078 3p25-3p25 BIOTINIDASE;BTD CASR GDB: 134196 3q21-3q24 HYPOCALCIURIC HYPERCALCEMIA, FAMILIAL;HHC1 CCR2 GDB: 337364 3p21-3p21 CHEMOKINE (C—C) RECEPTOR 2; CMKBR2 CCR5GDB: 1230510 3p21-3p21 CHEMOKINE (C—C) RECEPTOR 5; CMKBR5 CDL1 GDB:136344 3q26.3-3q26.3 DE LANGE SYNDROME; CDL CMT2B GDB: 604021 3q13-3q22CHARCOT-MARIE-TOOTH DISEASE, NEURONAL TYPE, B; CMT2B COL7A1 GDB: 1287503p21-3p21 COLLAGEN, TYPE VII, ALPHA-1; 3p21.3-3p21.3 COL7A1 CP GDB:119069 3q23-3q25 CERULOPLASMIN; CP 3q21-3q24 CRV GDB: 114983333p21.3-3p21.1 VASCULOPATHY, RETINAL, WITH CEREBRAL LEUKODYSTROPHY CTNNB1GDB: 141922 3p22-3p22 CATENIN, BETA 1; CTNNB1 3p21.3-3p21.3 DEM GDB:681157 3p12-3q11 DEMENTIA, FAMILIAL NONSPECIFIC; DEM ETM1 GDB: 97325233q13-3q13 TREMOR, HEREDITARY ESSENTIAL 1; ETM1 FANCD2 GDB: 6983453p25.3-3p25.3 FANCONI PANCYTOPENIA, 3pter-3p24.2 COMPLEMENTATION GROUP DFIH GDB: 9955790 3q13-3q13 HYPOPARATHYROIDISM, FAMILIAL ISOLATED; FIHFOXL2 GDB: 129025 3q23-3q23 BLEPHAROPHIMOSIS, 3q22-3q23 EPICANTHUSINVERSUS, AND PTOSIS; BPES GBE1 GDB: 138442 3p12-3p12 GLYCOGEN STORAGEDISEASE IV GLB1 GDB: 119987 3p22-3p21.33 GANGLIOSIDOSIS, 3p21.33-3p21.33GENERALIZED GM1, TYPE I GLC1C GDB: 3801941 3q21-3q24 GLAUCOMA 1, OPENANGLE, C; GLC1C GNAI2 GDB: 120516 3p21.3-3p21.2 GUANINENUCLEOTIDE-BINDING PROTEIN, ALPHA-INHIBITING, POLYPEPTIDE-2; GNAT1 GDB:119277 
3p21.3-3p21.2 GUANINE NUCLEOTIDE-BINDING PROTEIN,ALPHA-TRANSDUCING, POLYPEPTIDE GP9 GDB: 126370 3pter-3qter PLATELETGLYCOPROTEIN IX; GP9 GPX1 GDB: 119282 3q11-3q12 GLUTATHIONE PEROXIDASE;3p21.3-3p21.3 GPX1 HGD GDB: 203935 3q21-3q23 ALKAPTONURIA; AKU HRG GDB:120055 3q27-3q27 HISTIDINE-RICH GLYCOPROTEIN; HRG; HRGP ITIH1 GDB:120107 3p21.2-3p21.1 INTER-ALPHA-TRYPSIN INHIBITOR, HEAVY CHAIN-1;ITIH1; IATIH; ITIH KNG GDB: 125256 3q27-3q27 FLAUJEAC FACTOR DEFICIENCYLPP GDB: 1391795 3q27-3q28 LIM DOMAIN-CONTAINING PREFERRED TRANSLOCATIONPARTNER IN LIPOMA; LPP LRS1 GDB: 682448 3p21.1-3p14.1 LARSEN SYNDROME,AUTOSOMAL DOMINANT; LRS1 MCCC1 GDB: 135989 3q27-3q27BETA-METHYLCROTONYLGLY 3q25-3q27 CINURIA I MDS1 GDB: 250411 3q26-3q26MYELODYSPLASIA SYNDROME 1; MDS1 MHS4 GDB: 574245 3q13.1-3q13.1HYPERTHERMIA SUSCEPTIBILITY-4; MHS4 MITF GDB: 214776 3p14.1-3p12MICROPHTHALMIA-ASSOCIATED TRANSCRIPTION FACTOR; MITF WAARDENBURGSYNDROME, TYPE II; WS2 MLH1 GDB: 249617 3p23-3p22 COLON CANCER,FAMILIAL, 3p21.3-3p21.3 NONPOLYPOSIS TYPE 2; FCC2 MYL3 GDB: 1202183p21.3-3p21.2 MYOSIN, LIGHT CHAIN, ALKALI, VENTRICULAR AND SKELETALSLOW; MYL3 MYMY GDB: 11500610 3p26-3p24.2 DISEASE OPA1 GDB: 1188483q28-3q29 OPTIC ATROPHY 1; OPA1 PBXP1 GDB: 125352 3q22-3q23 PRE-B-CELLLEUKEMIA TRANSCRIPTION FACTOR-1; PBX1 PCCB GDB: 119474 3q21-3q22GLYCINEMIA, KETOTIC, II POU1F1 GDB: 129070 3p11-3p11 POU DOMAIN, CLASS1, TRANSCRIPTION FACTOR 1; POU1F1 PPARG GDB: 1223810 3p25-3p25 CANCER OFCOLON PEROXISOME PROLIFERATOR ACTIVATED RECEPTOR, GAMMA; PPARG PROS1GDB: 120721 3p11-3q11 PROTEINS, ALPHA; PROS1 3p11.1-3q11.2 PTHR1 GDB:138128 3p22-3p21.1 METAPHYSEAL CHONDRODYSPLASIA, MURK JANSEN TYPEPARATHYROID HORMONE RECEPTOR 1; PTHR1 RCA1 GDB: 230233 3p14.2-3p14.2RENAL CARCINOMA, FAMILIAL, ASSOCIATED 1; RCA1 RHO GDB: 1203473q21.3-3q24 RHODOPSIN; RHO SCA7 GDB: 454471 3p21.1-3p12 SPINOCEREBELLARATAXIA 7; SCA7 SCLC1 GDB: 9955750 3p23-3p21 SMALL-CELL CANCER OF THELUNG; SCCL SCN5A GDB: 132152 3p21-3p21 SODIUM CHANNEL, VOLTAGE-GATED,TYPE V, 
ALPHA POLYPEPTIDE; SCN5A SI GDB: 120377 3q25.2-3q26.2DISACCHARIDE INTOLERANCE I SLC25A20 GDB: 6503297 3p21.31-3p21.31CARNITINE-ACYLCARNITINE TRANSLOCASE; CACT SLC2A2 GDB: 119995 3q26.2-3q27SOLUTE CARRIER FAMILY 2, 3q26.1-3q26.3 MEMBER 2; SLC2A2 FANCONI-BICKELSYNDROME; FBS TF GDB: 120432 3q21-3q21 TRANSFERRIN; TF TGFBR2 GDB:224909 3p22-3p22 TRANSFORMING GROWTH 3pter-3p24.2 FACTOR-BETA RECEPTOR,TYPE II; TGFBR2 THPO GDB: 374007 3q26.3-3q27 THROMBOPOIETIN; THPO THRBGDB: 120731 3p24.1-3p22 THYROID HORMONE 3p24.3-3p24.3 RECEPTOR, BETA;THRB TKT GDB: 132402 3p14.3-3p14.3 WERNICKE-KORSAKOFF SYNDROME TM4SF1GDB: 250815 3q21-3q25 TUMOR-ASSOCIATED ANTIGEN L6; TAAL6 TRH GDB: 1280723pter-3qter THYROTROPIN-RELEASING HORMONE DEFICIENCY UMPS GDB: 1204823q13-3q13 OROTICACIDURIA I UQCRC1 GDB: 141850 3p21.3-3p21.2UBIQUINOL-CYTOCHROME c 3p21.3-3p21.3 REDUCTASE CORE PROTEIN I; UQCRC1USH3A GDB: 392645 3q21-3q25 USHER SYNDROME, TYPE III; USH3 VHL GDB:120488 3p26-3p25 VON HIPPEL-LINDAU SYNDROME; VHL WS2A GDB: 1280533p14.2-3p13 MICROPHTHALMIA-ASSOCIATED TRANSCRIPTION FACTOR; MITFWAARDENBURG SYNDROME, TYPE II; WS2 XPC GDB: 134769 3p25.1-3p25.1XERODERMA PIGMENTOSUM, COMPLEMENTATION GROUP C; XPC ZNF35 GDB: 1205073p21-3p21 ZINC FINGER PROTEIN-35; ZNF35

TABLE 5 Genes, Locations and Genetic Disorders on Chromosome 4
Gene GDB Accession ID Location OMIM Link
ADH1B GDB: 119651 4q21-4q23 4q22-4q22 ALCOHOL DEHYDROGENASE-2; ADH2
ADH1C GDB: 119652 4q21-4q23 4q22-4q22 ALCOHOL DEHYDROGENASE-3; ADH3
AFP GDB: 119660 4q11-4q13 ALPHA-FETOPROTEIN; AFP
AGA GDB: 118981 4q23-4q35 4q32-4q33 ASPARTYLGLUCOSAMINURIA; AGU
AIH2 GDB: 118751 4q11-4q13 4q13.3-4q21.2 AMELOGENESIS IMPERFECTA 2, HYPOPLASTIC LOCAL, AUTOSOMAL DOMINANT;
ALB GDB: 118990 4q11-4q13 ALBUMIN; ALB
ASMD GDB: 119705 4q-4q 4q28-4q31 ANTERIOR SEGMENT OCULAR DYSGENESIS; ASOD
BFHD GDB: 11498907 4q34.1-4q35 DYSPLASIA, BEUKES TYPE
CNGA1 GDB: 127557 4p14-4q13 CYCLIC NUCLEOTIDE GATED CHANNEL, PHOTORECEPTOR, cGMP GATED, 1; CNCG1
CRBM GDB: 9958132 4p16.3-4p16.3 CHERUBISM
DCK GDB: 126810 4q13.3-4q21.1 DEOXYCYTIDINE KINASE; DCK
DFNA6 GDB: 636175 4p16.3-4p16.3 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 6; DFNA6
DSPP GDB: 5560457 4pter-4qter 4q21.3-4q21.3 DENTIN PHOSPHOPROTEIN; DPP DENTINOGENESIS IMPERFECTA; DGI1
DTDP2 GDB: 9955810 4q-4q DENTIN DYSPLASIA, TYPE II
ELONG GDB: 11498700 4q24-4q24
ENAM GDB: 9955259 4q21-4q21 AMELOGENESIS IMPERFECTA 2, HYPOPLASTIC LOCAL, AUTOSOMAL DOMINANT; AMELOGENESIS IMPERFECTA, HYPOPLASTIC TYPE
ETFDH GDB: 135992 4q32-4q35 GLUTARICACIDURIA IIC; GAIIC
EVC GDB: 555573 4p16-4p16 ELLIS-VAN CREVELD SYNDROME; EVC
F11 GDB: 119891 4q35-4q35 PTA DEFICIENCY
FABP2 GDB: 119127 4q28-4q31 FATTY ACID BINDING PROTEIN 2, INTESTINAL; FABP2
FGA GDB: 119129 4q28-4q28 AMYLOIDOSIS, FAMILIAL VISCERAL FIBRINOGEN, A ALPHA POLYPEPTIDE; FGA
FGB GDB: 119130 4q28-4q28 FIBRINOGEN, B BETA POLYPEPTIDE; FGB
FGFR3 GDB: 127526 4p16.3-4p16.3 ACHONDROPLASIA; ACH BLADDER CANCER FIBROBLAST GROWTH FACTOR RECEPTOR-3; FGFR3
FGG GDB: 119132 4q28-4q28 FIBRINOGEN, G GAMMA POLYPEPTIDE; FGG
FSHMD1A GDB: 119914 4q35-4q35 FACIOSCAPULOHUMERAL MUSCULAR DYSTROPHY 1A; FSHMD1A
GC GDB: 119263 4q12-4q13 4q12-4q12 GROUP-SPECIFIC COMPONENT; GC
GNPTA GDB: 119280 4q21-4q23 MUCOLIPIDOSIS II; ML2; ML II
GNRHR GDB: 136456 4q13-4q13 4q21.2-4q21.2 GONADOTROPIN-RELEASING HORMONE RECEPTOR; GNRHR
GYPA GDB: 118890 4q28-4q31 4q28.2-4q31.1 BLOOD GROUP - MN LOCUS; MN
HCA GDB: 9954675 4q33-4qter HYPERCALCIURIA, FAMILIAL IDIOPATHIC
HCL2 GDB: 119305 4q28-4q31 4q-4q HAIR COLOR-2; HCL2
HD GDB: 119307 4p16.3-4p16.3 HUNTINGTON DISEASE; HD
HTN3 GDB: 125601 4q12-4q21 HISTATIN-3; HTN3
HVBS6 GDB: 120687 4q32-4q32 HEPATOCELLULAR CARCINOMA-2; HCC2
IDUA GDB: 119327 4p16.3-4p16.3 MUCOPOLYSACCHARIDOSIS TYPE I; MPS I
IF GDB: 120077 4q24-4q25 4q25-4q25 COMPLEMENT COMPONENT-3 INACTIVATOR, DEFICIENCY OF
JPD GDB: 120113 4pter-4qter 4q12-4q13 PERIODONTITIS, JUVENILE; JPD
KIT GDB: 120117 4q12-4q12 V-KIT HARDY-ZUCKERMAN 4 FELINE SARCOMA VIRAL ONCOGENE HOMOLOG; KIT
KLKB1 GDB: 127575 4q34-4q35 4q35-4q35 FLETCHER FACTOR DEFICIENCY
LQT4 GDB: 682072 4q25-4q27 SYNDROME WITHOUT PSYCHOMOTOR RETARDATION
MANBA GDB: 125261 4q21-4q25 MANNOSIDOSIS, BETA; MANB1
MLLT2 GDB: 136792 4q21-4q21 MYELOID/LYMPHOID OR MIXED LINEAGE LEUKEMIA, TRANSLOCATED TO, 2; MLLT2
MSX1 GDB: 120683 4p16.3-4p16.1 4p16.1-4p16.1 MSH, DROSOPHILA, HOMEO BOX, HOMOLOG OF, 1; MSX1
MTP GDB: 228961 4q24-4q24 MICROSOMAL TRIGLYCERIDE TRANSFER PROTEIN, 88 KD; MTP
NR3C2 GDB: 120188 4q31-4q31 4q31.1-4q31.1 PSEUDOHYPOALDOSTERONISM, TYPE I, AUTOSOMAL RECESSIVE; PHA1
PBT GDB: 120260 4q12-4q21 PIEBALD TRAIT; PBT
PDE6B GDB: 125915 4p16.3-4p16.3 NIGHTBLINDNESS, CONGENITAL STATIONARY; CSNB3 PHOSPHODIESTERASE 6B, cGMP-SPECIFIC, ROD, BETA; PDE6B
PEE1 GDB: 7016765 4q31-4q34 4q25-4qter 1; PEE1
PITX2 GDB: 134770 4q25-4q27 4q25-4q26 4q25-4q25 IRIDOGONIODYSGENESIS, TYPE 2; IRID2 RIEGER SYNDROME, TYPE 1; RIEG1 RIEG BICOID-RELATED HOMEOBOX TRANSCRIPTION FACTOR 1; RIEG1 HOMEO BOX 2
PKD2 GDB: 118851 4q21-4q23 POLYCYSTIC KIDNEY DISEASE 2; PKD2
QDPR GDB: 120331 4p15.3-4p15.3 4p15.31-4p15.31 PHENYLKETONURIA II
SGCB GDB: 702072 4q12-4q12 MUSCULAR DYSTROPHY, LIMB-GIRDLE, TYPE 2E; LGMD2E
SLC25A4 GDB: 119680 4q35-4q35 ADENINE NUCLEOTIDE TRANSLOCATOR 1; ANT1 PROGRESSIVE EXTERNAL OPHTHALMOPLEGIA; PEO
SNCA GDB: 439047 4q21.3-4q22 4q21-4q21 SYNUCLEIN, ALPHA; SNCA PARKINSON DISEASE, FAMILIAL, TYPE 1; PARK1
SOD3 GDB: 125291 4p16.3-4q21 SUPEROXIDE DISMUTASE, EXTRACELLULAR; SOD3
STATH GDB: 120391 4q11-4q13 STATHERIN; STATH; STR
TAPVR1 GDB: 392646 4p13-4q11 ANOMALOUS PULMONARY VENOUS RETURN; APVR
TYS GDB: 119624 4q-4q SCLEROTYLOSIS; TYS
WBS2 GDB: 132426 4q33-4q35.1 WILLIAMS-BEUREN SYNDROME; WBS
WFS1 GDB: 434294 4p-4p 4p16-4p16 DIABETES MELLITUS AND INSIPIDUS WITH OPTIC ATROPHY AND DEAFNESS
WHCR GDB: 125355 4p16.3-4p16.3 WOLF-HIRSCHHORN SYNDROME; WHS

TABLE 6 Genes, Locations and Genetic Disorders on Chromosome 5
Gene GDB Accession ID OMIM Link
ADAMTS2 GDB: 9957209 EHLERS-DANLOS SYNDROME, TYPE VII, AUTOSOMAL RECESSIVE
ADRB2 GDB: 120541 BETA-2-ADRENERGIC RECEPTOR; ADRB2
AMCN GDB: 9836823 ARTHROGRYPOSIS MULTIPLEX CONGENITA, NEUROGENIC TYPE
AP3B1 GDB: 9955590 HERMANSKY-PUDLAK SYNDROME; HPS
APC GDB: 119682 ADENOMATOUS POLYPOSIS OF THE COLON; APC
ARSB GDB: 119008 MUCOPOLYSACCHARIDOSIS TYPE VI; MPS VI
B4GALT7 GDB: 9957653 SYNDROME, PROGEROID FORM
BHR1 GDB: 9956078 ASTHMA
C6 GDB: 119045 COMPLEMENT COMPONENT-6, DEFICIENCY OF
C7 GDB: 119046 COMPLEMENT COMPONENT-7, DEFICIENCY OF
CCAL2 GDB: 5584265 CHONDROCALCINOSIS, FAMILIAL ARTICULAR
CKN1 GDB: 128586 COCKAYNE SYNDROME, TYPE I; CKN1
CMDJ GDB: 9595425 CRANIOMETAPHYSEAL DYSPLASIA, JACKSON TYPE; CMDJ
CRHBP GDB: 127438 CORTICOTROPIN RELEASING HORMONE-BINDING PROTEIN; CRHBP
CSF1R GDB: 120600 COLONY-STIMULATING FACTOR-1 RECEPTOR; CSF1R
DHFR GDB: 119845 DIHYDROFOLATE REDUCTASE; DHFR
DIAPH1 GDB: 9835482 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 1; DFNA1 DIAPHANOUS, DROSOPHILA, HOMOLOG OF, 1
DTR GDB: 119853 DIPHTHERIA TOXIN SENSITIVITY; DTS
EOS GDB: 9956083 EOSINOPHILIA, FAMILIAL
ERVR GDB: 9835857 HYALOIDEORETINAL DEGENERATION OF WAGNER
F12 GDB: 119892 HAGEMAN FACTOR DEFICIENCY
FBN2 GDB: 128122 CONTRACTURAL ARACHNODACTYLY, CONGENITAL; CCA
GDNF GDB: 450609 GLIAL CELL LINE-DERIVED NEUROTROPHIC FACTOR; GDNF
GHR GDB: 119984 GROWTH HORMONE RECEPTOR; GHR
GLRA1 GDB: 118801 GLYCINE RECEPTOR, ALPHA-1 SUBUNIT; GLRA1 KOK DISEASE
GM2A GDB: 120000 TAY-SACHS DISEASE, AB VARIANT
HEXB GDB: 119308 SANDHOFF DISEASE
HSD17B4 GDB: 385059 17-@BETA-HYDROXYSTEROID DEHYDROGENASE IV; HSD17B4
ITGA2 GDB: 128031 INTEGRIN, ALPHA-2; ITGA2
KFS GDB: 9958987 VERTEBRAL FUSION
LGMD1A GDB: 118832 MUSCULAR DYSTROPHY, LIMB-GIRDLE, TYPE 1A; LGMD1A
LOX GDB: 119367 LYSYL OXIDASE; LOX
LTC4S GDB: 384080 LEUKOTRIENE C4 SYNTHASE; LTC4S
MAN2A1 GDB: 136413 MANNOSIDASE, ALPHA, II; MANA2 DYSERYTHROPOIETIC ANEMIA, CONGENITAL, TYPE II
MCC GDB: 128163 MUTATED IN COLORECTAL CANCERS; MCC
MCCC2 GDB: 135990 II
MSH3 GDB: 641986 MutS, E. COLI, HOMOLOG OF, 3; MSH3
MSX2 GDB: 138766 MSH (DROSOPHILA) HOMEO BOX HOMOLOG 2; MSX2 PARIETAL FORAMINA, SYMMETRIC; PFM
NR3C1 GDB: 120017 GLUCOCORTICOID RECEPTOR; GRL
PCSK1 GDB: 128033 PROPROTEIN CONVERTASE SUBTILISIN/KEXIN TYPE 1; PCSK1
PDE6A GDB: 120265 PHOSPHODIESTERASE 6A, cGMP-SPECIFIC, ROD, ALPHA; PDE6A
PFBI GDB: 9956096 INTENSITY OF INFECTION IN
RASA1 GDB: 120339 RAS p21 PROTEIN ACTIVATOR 1; RASA1
SCZD1 GDB: 120370 DISORDER-1; SCZD1
SDHA GDB: 378037 SUCCINATE DEHYDROGENASE COMPLEX, SUBUNIT A, FLAVOPROTEIN; SDHA
SGCD GDB: 5886421 SARCOGLYCAN, DELTA; SGCD
SLC22A5 GDB: 9863277 CARNITINE DEFICIENCY, SYSTEMIC, DUE TO DEFECT IN RENAL REABSORPTION
SLC26A2 GDB: 125421 DIASTROPHIC DYSPLASIA; DTD EPIPHYSEAL DYSPLASIA, MULTIPLE; MED NEONATAL OSSEOUS DYSPLASIA I ACHONDROGENESIS, TYPE IB; ACG1B
SLC6A3 GDB: 132445 SOLUTE CARRIER FAMILY 6, MEMBER 3; SLC6A3 DEFICIT-HYPERACTIVITY DISORDER; ADHD
SM1 GDB: 9834488 SCHISTOSOMA MANSONI SUSCEPTIBILITY/RESISTANCE
SMA@ GDB: 120378 SPINAL MUSCULAR ATROPHY I; SMA I SURVIVAL OF MOTOR NEURON 1, TELOMERIC; SMN1
SMN1 GDB: 5215173 SPINAL MUSCULAR ATROPHY I; SMA I SURVIVAL OF MOTOR NEURON 1, TELOMERIC; SMN1
SMN2 GDB: 5215175 SPINAL MUSCULAR ATROPHY I; SMA I SURVIVAL OF MOTOR NEURON 2, CENTROMERIC; SMN2
SPINK5 GDB: 9956114 NETHERTON DISEASE
TCOF1 GDB: 127390 TREACHER COLLINS-FRANCESCHETTI SYNDROME 1; TCOF1
TGFBI GDB: 597601 CORNEAL DYSTROPHY, GRANULAR TYPE CORNEAL DYSTROPHY, LATTICE TYPE I; CDL1 TRANSFORMING GROWTH FACTOR, BETA-INDUCED, 68 KD; TGFBI

TABLE 7 Genes, Locations and Genetic Disorders on Chromosome 6
Gene GDB Accession ID OMIM Link
ALDH5A1 GDB: 454767 SUCCINIC SEMIALDEHYDE DEHYDROGENASE, NAD(+)-DEPENDENT; SSADH
ARG1 GDB: 119006 ARGININEMIA
AS GDB: 135697 ANKYLOSING SPONDYLITIS; AS
ASSP2 GDB: 119017 CITRULLINEMIA
BCKDHB GDB: 118759 MAPLE SYRUP URINE DISEASE, TYPE IB
BF GDB: 119726 GLYCINE-RICH BETA-GLYCOPROTEIN; GBG
C2 GDB: 119731 COMPLEMENT COMPONENT-2, DEFICIENCY OF
C4A GDB: 119732 COMPLEMENT COMPONENT 4A; C4A
CDKN1A GDB: 266550 CYCLIN-DEPENDENT KINASE INHIBITOR 1A CDKN1A
COL10A1 GDB: 128635 COLLAGEN, TYPE X, ALPHA 1; COL10A1
COL11A2 GDB: 119788 COLLAGEN, TYPE XI, ALPHA-2; COL11A2 STICKLER SYNDROME, TYPE II; STL2 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 13; DFNA13
CYP21A2 GDB: 120605 ADRENAL HYPERPLASIA, CONGENITAL, DUE TO 21-HYDROXYLASE DEFICIENCY
DYX2 GDB: 437584 DYSLEXIA, SPECIFIC, 2; DYX2
EJM1 GDB: 119864 MYOCLONIC EPILEPSY, JUVENILE; EJM1
ELOVL4 GDB: 11499609 STARGARDT DISEASE 3; STGD3
EPM2A GDB: 3763331 EPILEPSY, PROGRESSIVE MYOCLONIC 2; EPM2
ESR1 GDB: 119120 ESTROGEN RECEPTOR; ESR
EYA4 GDB: 700062 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 10; DFNA10
F13A1 GDB: 120614 FACTOR XIII, A1 SUBUNIT; F13A1
FANCE GDB: 1220236 FANCONI ANEMIA, COMPLEMENTATION GROUP E; FACE
GCLC GDB: 132915 GAMMA-GLUTAMYLCYSTEINE SYNTHETASE DEFICIENCY, HEMOLYTIC ANEMIA DUE
GJA1 GDB: 125196 GAP JUNCTION PROTEIN, ALPHA-1, 43 KD; GJA1
GLYS1 GDB: 136421 GLYCOSURIA, RENAL
GMPR GDB: 127058 GUANINE MONOPHOSPHATE REDUCTASE
GSE GDB: 9956235 DISEASE; CD
HCR GDB: 9993306 PSORIASIS, SUSCEPTIBILITY TO
HFE GDB: 119309 HEMOCHROMATOSIS; HFE
HLA-A GDB: 119310 HLA-A HISTOCOMPATIBILITY TYPE; HLAA
HLA-DPB1 GDB: 120636 HLA-DP HISTOCOMPATIBILITY TYPE, BETA-1 SUBUNIT
HLA-DRA GDB: 120641 HLA-DR HISTOCOMPATIBILITY TYPE; HLA-DRA
HPFH GDB: 9849006 HETEROCELLULAR HEREDITARY PERSISTENCE OF FETAL HEMOGLOBIN
ICS1 GDB: 136433 IMMOTILE CILIA SYNDROME-1; ICS1
IDDM1 GDB: 9953173 DIABETES MELLITUS, JUVENILE-ONSET INSULIN-DEPENDENT; IDDM
IFNGR1 GDB: 120688 INTERFERON, GAMMA, RECEPTOR-1; IFNGR1
IGAD1 GDB: 6929077 SELECTIVE DEFICIENCY OF
IGF2R GDB: 120083 INSULIN-LIKE GROWTH FACTOR 2 RECEPTOR; IGF2R
ISCW GDB: 9956158 SUPPRESSION; IS
LAMA2 GDB: 132362 LAMININ, ALPHA 2; LAMA2
LAP GDB: 9958992 LARYNGEAL ADDUCTOR PARALYSIS; LAP
LCA5 GDB: 11498764 AMAUROSIS CONGENITA OF LEBER I
LPA GDB: 120699 APOLIPOPROTEIN(a); LPA
MCDR1 GDB: 131406 MACULAR DYSTROPHY, RETINAL, 1, NORTH CAROLINA TYPE; MCDR1
MOCS1 GDB: 9862235 MOLYBDENUM COFACTOR DEFICIENCY
MUT GDB: 120204 METHYLMALONICACIDURIA DUE TO METHYLMALONIC CoA MUTASE DEFICIENCY
MYB GDB: 119441 V-MYB AVIAN MYELOBLASTOSIS VIRAL ONCOGENE HOMOLOG; MYB
NEU1 GDB: 120230 NEURAMINIDASE DEFICIENCY
NKS1 GDB: 128100 SUSCEPTIBILITY TO LYSIS BY ALLOREACTIVE NATURAL KILLER CELLS; EC1
NYS2 GDB: 9848763 NYSTAGMUS, CONGENITAL
OA3 GDB: 136429 ALBINISM, OCULAR, AUTOSOMAL RECESSIVE; OAR
ODDD GDB: 6392584 OCULODENTODIGITAL DYSPLASIA; ODDD
OFC1 GDB: 120247 OROFACIAL CLEFT 1; OFC1
PARK2 GDB: 6802742 PARKINSONISM, JUVENILE
PBCA GDB: 9956321 BETA CELL AGENESIS WITH NEONATAL DIABETES MELLITUS
PBCRA1 GDB: 3763333 CHORIORETINAL ATROPHY, PROGRESSIVE BIFOCAL; CRAPB
PDB1 GDB: 136349 DISEASE OF BONE; PDB
PEX3 GDB: 9955507 ZELLWEGER SYNDROME; ZS
PEX6 GDB: 5592414 ZELLWEGER SYNDROME; ZS PEROXIN-6; PEX6
PEX7 GDB: 6155803 RHIZOMELIC CHONDRODYSPLASIA PUNCTATA; RCDP PEROXIN-7; PEX7
PKHD1 GDB: 433910 POLYCYSTIC KIDNEY AND HEPATIC DISEASE-1; PKHD1
PLA2G7 GDB: 9958829 PLATELET-ACTIVATING FACTOR ACETYLHYDROLASE, SUBUNIT
PLG GDB: 119498 PLASMINOGEN; PLG
POLH GDB: 6963323 PIGMENTOSUM WITH NORMAL DNA REPAIR RATES
PPAC GDB: 9956248 ARTHROPATHY, PROGRESSIVE PSEUDORHEUMATOID, OF CHILDHOOD
PSORS1 GDB: 6381310 PSORIASIS, SUSCEPTIBILITY TO
PUJO GDB: 9956231 MULTICYSTIC RENAL DYSPLASIA, BILATERAL; MRD
RCD1 GDB: 333929 RETINAL CONE DEGENERATION
RDS GDB: 118863 RETINAL DEGENERATION, SLOW; RDS
RHAG GDB: 136011 RHESUS BLOOD GROUP-ASSOCIATED GLYCOPROTEIN; RHAG RH-NULL, REGULATOR TYPE; RHN
RP14 GDB: 433713 RETINITIS PIGMENTOSA-14; RP14 TUBBY-LIKE PROTEIN 1; TULP1
RUNX2 GDB: 392082 CLEIDOCRANIAL DYSPLASIA; CCD CORE-BINDING FACTOR, RUNT DOMAIN, ALPHA SUBUNIT 1; CBFA1
RWS GDB: 9956195 SENSITIVITY
SCA1 GDB: 119588 SPINOCEREBELLAR ATAXIA 1; SCA1
SCZD3 GDB: 635974 DISORDER-3; SCZD3
SIASD GDB: 433552 SIALIC ACID STORAGE DISEASE; SIASD
SOD2 GDB: 119597 SUPEROXIDE DISMUTASE 2, MITOCHONDRIAL; SOD2
ST8 GDB: 6118456 OVARIAN TUMOR
TAP1 GDB: 132668 TRANSPORTER 1, ABC; TAP1
TAP2 GDB: 132669 TRANSPORTER 2, ABC; TAP2
TFAP2B GDB: 681506 DUCTUS ARTERIOSUS; PDA TRANSCRIPTION FACTOR AP-2 BETA; TFAP2B
TNDM GDB: 9956265 DIABETES MELLITUS, TRANSIENT NEONATAL
TNF GDB: 120441 TUMOR NECROSIS FACTOR; TNF
TPBG GDB: 125568 TROPHOBLAST GLYCOPROTEIN; TPBG; M6P1
TPMT GDB: 209025 THIOPURINE S-METHYLTRANSFERASE; TPMT
TULP1 GDB: 6199353 TUBBY-LIKE PROTEIN 1; TULP1
WISP3 GDB: 9957361 ARTHROPATHY, PROGRESSIVE PSEUDORHEUMATOID, OF CHILDHOOD

TABLE 8 Genes, Locations and Genetic Disorders on Chromosome 7
Gene GDB Accession ID OMIM Link
AASS GDB: 11502144 HYPERLYSINEMIA
ABCB1 GDB: 120712 P-GLYCOPROTEIN-1; PGY1
ABCB4 GDB: 120713 P-GLYCOPROTEIN-3; PGY3
ACHE GDB: 118746 ACETYLCHOLINESTERASE BLOOD GROUP - Yt SYSTEM; YT
AQP1 GDB: 129082 AQUAPORIN-1; AQP1 BLOOD GROUP - COLTON; CO
ASL GDB: 119703 ARGININOSUCCINICACIDURIA
ASNS GDB: 119706 ASPARAGINE SYNTHETASE; ASNS; AS
AUTS1 GDB: 9864226 DISORDER
BPGM GDB: 119039 DIPHOSPHOGLYCERATE MUTASE DEFICIENCY OF ERYTHROCYTE
C7orf2 GDB: 10794644 ACHEIROPODY
CACNA2D1 GDB: 132010 CALCIUM CHANNEL, VOLTAGE-DEPENDENT, L TYPE, ALPHA-2/DELTA SUBUNIT; MALIGNANT HYPERTHERMIA SUSCEPTIBILITY-3
CCM1 GDB: 580824 CEREBRAL CAVERNOUS MALFORMATIONS 1; CCM1
CD36 GDB: 138800 CD36 ANTIGEN; CD36
CFTR GDB: 120584 CYSTIC FIBROSIS; CF DEFERENS, CONGENITAL BILATERAL APLASIA OF; CBAVD; CAVD
CHORDOMA GDB: 11498328
CLCN1 GDB: 134688 CHLORIDE CHANNEL 1, SKELETAL MUSCLE; CLCN1
CMH6 GDB: 9956392 CARDIOMYOPATHY, FAMILIAL HYPERTROPHIC, WITH WOLFF-PARKINSON-WHITE
CMT2D GDB: 9953232 CHARCOT-MARIE-TOOTH DISEASE, NEURONAL TYPE, D
COL1A2 GDB: 119062 COLLAGEN, TYPE I, ALPHA-2 POLYPEPTIDE; COL1A2 OSTEOGENESIS IMPERFECTA TYPE I OSTEOGENESIS IMPERFECTA TYPE IV; OI4
CRS GDB: 119073 CRANIOSYNOSTOSIS, TYPE 1; CRS1
CYMD GDB: 366594 MACULAR EDEMA, CYSTOID
DFNA5 GDB: 636174 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 5; DFNA5
DLD GDB: 120608 LIPOAMIDE DEHYDROGENASE DEFICIENCY, LACTIC ACIDOSIS DUE TO
DYT11 GDB: 10013754 MYOCLONUS, HEREDITARY ESSENTIAL
EEC1 GDB: 136338 ECTRODACTYLY, ECTODERMAL DYSPLASIA, AND CLEFT LIP/PALATE; EEC
ELN GDB: 119107 ELASTIN; ELN WILLIAMS-BEUREN SYNDROME; WBS
ETV1 GDB: 335229 ETS VARIANT GENE 1; ETV1
FKBP6 GDB: 9955215 WILLIAMS-BEUREN SYNDROME; WBS
GCK GDB: 127550 DIABETES MELLITUS, AUTOSOMAL DOMINANT, TYPE II GLUCOKINASE; GCK
GHRHR GDB: 138465 GROWTH HORMONE-RELEASING HORMONE RECEPTOR; GHRHR
GHS GDB: 9956363 MICROSOMIA WITH RADIAL DEFECTS
GLI3 GDB: 119990 PALLISTER-HALL SYNDROME; PHS GLI-KRUPPEL FAMILY MEMBER 3; GLI3 POSTAXIAL POLYDACTYLY, TYPE A1 GREIG CEPHALOPOLYSYNDACTYLY SYNDROME; GCPS
GPDS1 GDB: 9956410 GLAUCOMA, PIGMENT-DISPERSION TYPE
GUSB GDB: 120025 MUCOPOLYSACCHARIDOSIS TYPE VII
HADH GDB: 120033 HYDROXYACYL-CoA DEHYDROGENASE/3-KETOACYL-CoA THIOLASE/ENOYL-CoA HYDRATASE,
HLXB9 GDB: 136411 HOMEO BOX GENE HB9; HLXB9 SACRAL AGENESIS, HEREDITARY, WITH PRESACRAL MASS, ANTERIOR MENINGOCELE,
HOXA13 GDB: 120656 HOMEO BOX A13; HOXA13
HPFH2 GDB: 128071 HEREDITARY PERSISTENCE OF FETAL HEMOGLOBIN, HETEROCELLULAR, INDIAN
HRX GDB: 9958999 HRX
IAB GDB: 11498909 ANEURYSM, INTRACRANIAL BERRY
IMMP2L GDB: 11499195 GILLES DE LA TOURETTE SYNDROME; GTS
KCNH2 GDB: 138126 LONG QT SYNDROME, TYPE 2; LQT2
LAMB1 GDB: 119357 LAMININ BETA 1; LAMB1
LEP GDB: 136420 LEPTIN; LEP
MET GDB: 120178 MET PROTO-ONCOGENE; MET
NCF1 GDB: 120222 GRANULOMATOUS DISEASE, CHRONIC, AUTOSOMAL CYTOCHROME-b-POSITIVE FORM
NM GDB: 119454 NEUTROPHIL CHEMOTACTIC RESPONSE; NCR
OGDH GDB: 118847 ALPHA-KETOGLUTARATE DEHYDROGENASE DEFICIENCY
OPN1SW GDB: 119032 TRITANOPIA
PEX1 GDB: 9787110 ZELLWEGER SYNDROME; ZS PEROXIN-1; PEX1
PGAM2 GDB: 120280 PHOSPHOGLYCERATE MUTASE, DEFICIENCY OF M SUBUNIT OF
PMS2 GDB: 386406 POSTMEIOTIC SEGREGATION INCREASED (S. CEREVISIAE)-2; PMS2
PON1 GDB: 120308 PARAOXONASE 1; PON1
PPP1R3A GDB: 136797 PROTEIN PHOSPHATASE 1, REGULATORY (INHIBITOR) SUBUNIT 3; PPP1R3
PRSS1 GDB: 119620 PANCREATITIS, HEREDITARY; PCTT PROTEASE, SERINE, 1; PRSS1
PTC GDB: 118744 PHENYLTHIOCARBAMIDE TASTING
PTPN12 GDB: 136846 PROTEIN-TYROSINE PHOSPHATASE, NONRECEPTOR TYPE, 12; PTPN12
RP10 GDB: 138786 RETINITIS PIGMENTOSA-10; RP10
RP9 GDB: 333931 RETINITIS PIGMENTOSA-9; RP9
SERPINE1 GDB: 120297 PLASMINOGEN ACTIVATOR INHIBITOR, TYPE I; PAI1
SGCE GDB: 9958714 MYOCLONUS, HEREDITARY ESSENTIAL
SHFM1 GDB: 128195 SPLIT-HAND/FOOT DEFORMITY, TYPE I; SHFD1
SHH GDB: 456309 HOLOPROSENCEPHALY, TYPE 3; HPE3 SONIC HEDGEHOG, DROSOPHILA, HOMOLOG OF; SHH
SLC26A3 GDB: 138165 DOWN-REGULATED IN ADENOMA; DRA CHLORIDE DIARRHEA, FAMILIAL; CLD
SLC26A4 GDB: 5584511 PENDRED SYNDROME; PDS DEAFNESS, NEUROSENSORY, AUTOSOMAL RECESSIVE, 4; DFNB4
SLOS GDB: 385950 SMITH-LEMLI-OPITZ SYNDROME
SMAD1 GDB: 3763345 SPINAL MUSCULAR ATROPHY, DISTAL, WITH UPPER LIMB PREDOMINANCE; SMAD1
TBXAS1 GDB: 128744 THROMBOXANE A SYNTHASE 1; TBXAS1
TWIST GDB: 135694 ACROCEPHALOSYNDACTYLY TYPE III TWIST, DROSOPHILA, HOMOLOG OF; TWIST
ZWS1 GDB: 120511 ZELLWEGER SYNDROME; ZS

TABLE 9 Genes, Locations and Genetic Disorders on Chromosome 8
Gene GDB Accession ID OMIM Link
ACHM3 GDB: 9120558 PINGELAPESE BLINDNESS
ADRB3 GDB: 203869 BETA-3-ADRENERGIC RECEPTOR; ADRB3
ANK1 GDB: 118737 SPHEROCYTOSIS, HEREDITARY; HS
CA1 GDB: 119047 CARBONIC ANHYDRASE I, ERYTHROCYTE, ELECTROPHORETIC VARIANTS OF; CA1
CA2 GDB: 119739 OSTEOPETROSIS WITH RENAL TUBULAR ACIDOSIS
CCAL1 GDB: 512892 CHONDROCALCINOSIS WITH EARLY-ONSET OSTEOARTHRITIS; CCAL2
CLN8 GDB: 252118 EPILEPSY, PROGRESSIVE, WITH MENTAL RETARDATION; EPMR
CMT4A GDB: 138755 CHARCOT-MARIE-TOOTH NEUROPATHY 4A; CMT4A
CNGB3 GDB: 9993286 PINGELAPESE BLINDNESS
COH1 GDB: 252122 COHEN SYNDROME; COH1
CPP GDB: 119798 CERULOPLASMIN; CP
CRH GDB: 119804 CORTICOTROPIN-RELEASING HORMONE; CRH
CYP11B1 GDB: 120603 ADRENAL HYPERPLASIA, CONGENITAL, DUE TO 11-@BETA-HYDROXYLASE DEFICIENCY
CYP11B2 GDB: 120514 CYTOCHROME P450, SUBFAMILY XIB, POLYPEPTIDE 2; CYP11B2
DECR1 GDB: 453934 2,4-@DIENOYL-CoA REDUCTASE; DECR
DPYS GDB: 5885803 DIHYDROPYRIMIDINASE; DPYS
DURS1 GDB: 9958126 DUANE SYNDROME
EBS1 GDB: 119856 EPIDERMOLYSIS BULLOSA SIMPLEX, OGNA TYPE
ECA1 GDB: 10796318 JUVENILE ABSENCE
EGI GDB: 128830 EPILEPSY, GENERALIZED, IDIOPATHIC; EGI
EXT1 GDB: 135994 EXOSTOSES, MULTIPLE, TYPE I; EXT1 CHONDROSARCOMA
EYA1 GDB: 5215167 BRANCHIOOTORENAL DYSPLASIA EYES ABSENT 1; EYA1
FGFR1 GDB: 119913 ACROCEPHALOSYNDACTYLY TYPE V FIBROBLAST GROWTH FACTOR RECEPTOR-1; FGFR1
GNRH1 GDB: 133746 GONADOTROPIN-RELEASING HORMONE 1; GNRH1 FAMILIAL HYPOGONADOTROPHIC
GSR GDB: 119288 GLUTATHIONE REDUCTASE; GSR
GULOP GDB: 128078 SCURVY
HR GDB: 595499 ALOPECIA UNIVERSALIS ATRICHIA WITH PAPULAR LESIONS HAIRLESS, MOUSE, HOMOLOG OF
KCNQ3 GDB: 9787230 CONVULSIONS, BENIGN FAMILIAL NEONATAL, TYPE 2; BFNC2 POTASSIUM CHANNEL, VOLTAGE-GATED, SUBFAMILY Q, MEMBER 3
KFM GDB: 265291 KLIPPEL-FEIL SYNDROME; KFS; KFM
KWE GDB: 9315120 KERATOLYTIC WINTER ERYTHEMA
LGCR GDB: 120698 LANGER-GIEDION SYNDROME; LGS
LPL GDB: 120700 HYPERLIPOPROTEINEMIA, TYPE I
MCPH1 GDB: 9834525 MICROCEPHALY; MCT
MOS GDB: 119396 TRANSFORMATION GENE: ONCOGENE MOS
MYC GDB: 120208 TRANSFORMATION GENE: ONCOGENE MYC; MYC
NAT1 GDB: 125364 ARYLAMIDE ACETYLASE 1; AAC1
NAT2 GDB: 125365 ISONIAZID INACTIVATION
NBS1 GDB: 9598211 NIJMEGEN BREAKAGE SYNDROME
PLAT GDB: 119496 PLASMINOGEN ACTIVATOR, TISSUE; PLAT
PLEC1 GDB: 4119073 EPIDERMOLYSIS BULLOSA SIMPLEX AND LIMB-GIRDLE MUSCULAR DYSTROPHY PLECTIN 1; PLEC1
PRKDC GDB: 234702 SEVERE COMBINED IMMUNODEFICIENCY DISEASE-1; SCID 1 PROTEIN KINASE, DNA-ACTIVATED, CATALYTIC SUBUNIT; PRKDC
PXMP3 GDB: 131487 PEROXIN-2; PEX2 ZELLWEGER SYNDROME; ZS
RP1 GDB: 120352 RETINITIS PIGMENTOSA-1; RP1
SCZD6 GDB: 9864736 DISORDER-2; SCZD2
SFTPC GDB: 120373 PULMONARY SURFACTANT APOPROTEIN PSP-C
SGM1 GDB: 135350 KLIPPEL-FEIL SYNDROME; KFS; KFM
SPG5A GDB: 250332 SPASTIC PARAPLEGIA-5A, AUTOSOMAL RECESSIVE; SPG5A
STAR GDB: 635457 STEROIDOGENIC ACUTE REGULATORY PROTEIN; STAR
TG GDB: 120434 THYROGLOBULIN; TG
TRPS1 GDB: 594960 TRICHORHINOPHALANGEAL SYNDROME, TYPE I; TRPS1
TTPA GDB: 512364 VITAMIN E, FAMILIAL ISOLATED DEFICIENCY OF; VED TOCOPHEROL (ALPHA) TRANSFER PROTEIN; TTPA
VMD1 GDB: 119631 MACULAR DYSTROPHY, ATYPICAL VITELLIFORM; VMD1
WRN GDB: 128446 WERNER SYNDROME; WRN

TABLE 10 Genes, Locations and Genetic Disorders on Chromosome 9
Gene GDB Accession ID OMIM Link
ABCA1 GDB: 305294 ANALPHALIPOPROTEINEMIA ATP-BINDING CASSETTE 1; ABC1
ABL1 GDB: 119640 ABELSON MURINE LEUKEMIA VIRAL ONCOGENE HOMOLOG 1; ABL1
ABO GDB: 118956 ABO BLOOD GROUP; ABO
ADAMTS13 GDB: 9956467 THROMBOCYTOPENIC PURPURA
AK1 GDB: 119664 ADENYLATE KINASE-1; AK1
ALAD GDB: 119665 DELTA-AMINOLEVULINATE DEHYDRATASE; ALAD
ALDH1A1 GDB: 119667 ALDEHYDE DEHYDROGENASE-1; ALDH1
ALDOB GDB: 119669 FRUCTOSE INTOLERANCE, HEREDITARY
AMBP GDB: 120696 PROTEIN HC; HCP
AMCD1 GDB: 437519 ARTHROGRYPOSIS MULTIPLEX CONGENITA, DISTAL, TYPE 1; AMCD1
ASS GDB: 119010 CITRULLINEMIA
BDMF GDB: 9954424 BONE DYSPLASIA WITH MEDULLARY FIBROSARCOMA
BSCL GDB: 9957720 SEIP SYNDROME
C5 GDB: 119734 COMPLEMENT COMPONENT-5, DEFICIENCY OF
CDKN2A GDB: 335362 MELANOMA, CUTANEOUS MALIGNANT, 2; CMM2 CYCLIN-DEPENDENT KINASE INHIBITOR 2A; CDKN2A
CHAC GDB: 6268491 CHOREOACANTHOCYTOSIS; CHAC
CHH GDB: 138268 CARTILAGE-HAIR HYPOPLASIA; CHH
CMD1B GDB: 677147 CARDIOMYOPATHY, DILATED 1B; CMD1B
COL5A1 GDB: 131457 COLLAGEN, TYPE V, ALPHA-1 POLYPEPTIDE; COL5A1
CRAT GDB: 359759 CARNITINE ACETYLTRANSFERASE; CRAT
DBH GDB: 119836 DOPAMINE BETA-HYDROXYLASE, PLASMA; DBH
DFNB11 GDB: 1220180 DEAFNESS, NEUROSENSORY, AUTOSOMAL RECESSIVE, 7; DFNB7
DFNB7 GDB: 636178 DEAFNESS, NEUROSENSORY, AUTOSOMAL RECESSIVE, 7; DFNB7
DNAI1 GDB: 11500297 IMMOTILE CILIA SYNDROME-1; ICS1
DYS GDB: 137085 DYSAUTONOMIA, FAMILIAL; DYS
DYT1 GDB: 119854 DYSTONIA 1, TORSION; DYT1
ENG GDB: 137193 ENDOGLIN; ENG
EPB72 GDB: 128993 ERYTHROCYTE SURFACE PROTEIN BAND 7.2; EPB72 STOMATOCYTOSIS I
FANCC GDB: 132672 FANCONI ANEMIA, COMPLEMENTATION GROUP C; FACC
FBP1 GDB: 141539 FRUCTOSE-1,6-BISPHOSPHATASE 1; FBP1
FCMD GDB: 250412 FUKUYAMA-TYPE CONGENITAL MUSCULAR DYSTROPHY; FCMD
FRDA GDB: 119951 FRIEDREICH ATAXIA 1; FRDA1
GALT GDB: 119971 GALACTOSEMIA
GLDC GDB: 128611 HYPERGLYCINEMIA, ISOLATED NONKETOTIC, TYPE I; NKH1
GNE GDB: 9954891 INCLUSION BODY MYOPATHY; IBM2
GSM1 GDB: 9784210 GENIOSPASM 1; GSM1
GSN GDB: 120019 AMYLOIDOSIS V GELSOLIN; GSN
HSD17B3 GDB: 347487 PSEUDOHERMAPHRODITISM, MALE, WITH GYNECOMASTIA
HSN1 GDB: 3853677 NEUROPATHY, HEREDITARY SENSORY, TYPE 1
IBM2 GDB: 3801447 INCLUSION BODY MYOPATHY; IBM2
LALL GDB: 9954426 LEUKEMIA, ACUTE, WITH LYMPHOMATOUS FEATURES; LALL
LCCS GDB: 386141 LETHAL CONGENITAL CONTRACTURE SYNDROME; LCCS
LGMD2H GDB: 9862233 DYSTROPHY, HUTTERITE TYPE
LMX1B GDB: 9834526 NAIL-PATELLA SYNDROME; NPS1
MLLT3 GDB: 138172 MYELOID/LYMPHOID OR MIXED LINEAGE LEUKEMIA, TRANSLOCATED TO, 3; MLLT3
MROS GDB: 9954430 MELKERSSON SYNDROME
MSSE GDB: 128019 EPITHELIOMA, SELF-HEALING SQUAMOUS
NOTCH1 GDB: 131400 NOTCH, DROSOPHILA, HOMOLOG OF, 1; NOTCH 1
ORM1 GDB: 120250 OROSOMUCOID 1; ORM1
PAPPA GDB: 134729 PREGNANCY-ASSOCIATED PLASMA PROTEIN A; PAPPA
PIP5K1B GDB: 686238 FRIEDREICH ATAXIA 1; FRDA1
PTCH GDB: 119447 BASAL CELL NEVUS SYNDROME; BCNS PATCHED, DROSOPHILA, HOMOLOG OF; PTCH
PTGS1 GDB: 128070 PROSTAGLANDIN-ENDOPEROXIDASE SYNTHASE 1; PTGS1
RLN1 GDB: 119552 RELAXIN; RLN1
RLN2 GDB: 119553 RELAXIN, OVARIAN, OF PREGNANCY
RMRP GDB: 120348 MITOCHONDRIAL RNA-PROCESSING ENDORIBONUCLEASE, RNA COMPONENT OF; RMRP; CARTILAGE-HAIR HYPOPLASIA; CHH
ROR2 GDB: 136454 BRACHYDACTYLY, TYPE B; BDB ROBINOW SYNDROME, RECESSIVE FORM NEUROTROPHIC TYROSINE KINASE, RECEPTOR-RELATED 2; NTRKR2
RPD1 GDB: 9954440 RETINITIS PIGMENTOSA-DEAFNESS SYNDROME 1, AUTOSOMAL DOMINANT
SARDH GDB: 9835149 SARCOSINEMIA
TDFA GDB: 9954420 FACTOR, AUTOSOMAL
TEK GDB: 344185 VENOUS MALFORMATIONS, MULTIPLE CUTANEOUS AND MUCOSAL; VMCM TEK TYROSINE KINASE, ENDOTHELIAL; TEK
TSC1 GDB: 120735 TUBEROUS SCLEROSIS-1; TSC1
TYRP1 GDB: 126337 TYROSINASE-RELATED PROTEIN 1; TYRP1 ALBINISM III XANTHISM
XPA GDB: 125363 XERODERMA PIGMENTOSUM I

TABLE 11 Genes, Locations and Genetic Disorders on Chromosome 10
Gene GDB Accession ID OMIM Link
CACNB2 GDB: 132014 CALCIUM CHANNEL, VOLTAGE-DEPENDENT, BETA-2 SUBUNIT; CACNB2
COL17A1 GDB: 131396 COLLAGEN, TYPE XVII, ALPHA-1 POLYPEPTIDE; COL17A1
CUBN GDB: 636049 MEGALOBLASTIC ANEMIA 1; MGA1
CYP17 GDB: 119829 ADRENAL HYPERPLASIA, CONGENITAL, DUE TO 17-ALPHA-HYDROXYLASE DEFICIENCY
CYP2C19 GDB: 119831 CYTOCHROME P450, SUBFAMILY IIC, POLYPEPTIDE 19; CYP2C19
CYP2C9 GDB: 131455 CYTOCHROME P450, SUBFAMILY IIC, POLYPEPTIDE 9; CYP2C9
EGR2 GDB: 120611 EARLY GROWTH RESPONSE-2; EGR2
EMX2 GDB: 277886 EMPTY SPIRACLES, DROSOPHILA, 2, HOMOLOG OF; EMX2
EPT GDB: 9786112 EPILEPSY, PARTIAL; EPT
ERCC6 GDB: 119882 EXCISION-REPAIR CROSS-COMPLEMENTING RODENT REPAIR DEFICIENCY, COMPLEMENTATION
FGFR2 GDB: 127273 ACROCEPHALOSYNDACTYLY TYPE V FIBROBLAST GROWTH FACTOR RECEPTOR-2; FGFR2
HK1 GDB: 120044 HEXOKINASE-1; HK1
HOX11 GDB: 119607 HOMEO BOX-11; HOX11
HPS GDB: 127359 HERMANSKY-PUDLAK SYNDROME; HPS
IL2RA GDB: 119345 INTERLEUKIN-2 RECEPTOR, ALPHA; IL2RA
LGI1 GDB: 9864936 EPILEPSY, PARTIAL; EPT
LIPA GDB: 120153 WOLMAN DISEASE
MAT1A GDB: 129077 METHIONINE ADENOSYLTRANSFERASE DEFICIENCY
MBL2 GDB: 120167 MANNOSE-BINDING PROTEIN, SERUM; MBP1
MKI67 GDB: 120185 PROLIFERATION-RELATED Ki-67 ANTIGEN; MKI67
MXI1 GDB: 137182 MAX INTERACTING PROTEIN 1; MXI1
OAT GDB: 120246 ORNITHINE AMINOTRANSFERASE DEFICIENCY
OATL3 GDB: 215803 ORNITHINE AMINOTRANSFERASE DEFICIENCY
PAX2 GDB: 138771 PAIRED BOX HOMEOTIC GENE 2; PAX2
PCBD GDB: 138478 PTERIN-4-ALPHA-CARBINOLAMINE DEHYDRATASE; PCBD PRIMAPTERINURIA
PEO1 GDB: 632784 PROGRESSIVE EXTERNAL OPHTHALMOPLEGIA; PEO
PHYH GDB: 9263423 REFSUM DISEASE PHYTANOYL-CoA HYDROXYLASE; PHYH
PNLIP GDB: 127916 LIPASE, CONGENITAL ABSENCE OF PANCREATIC
PSAP GDB: 120366 PROSAPOSIN; PSAP
PTEN GDB: 6022948 MACROCEPHALY, MULTIPLE LIPOMAS AND HEMANGIOMATA MULTIPLE HAMARTOMA SYNDROME; MHAM POLYPOSIS, JUVENILE INTESTINAL PHOSPHATASE AND TENSIN HOMOLOG; PTEN
RBP4 GDB: 120342 RETINOL-BINDING PROTEIN, PLASMA; RBP4
RDPA GDB: 9954445 REFSUM DISEASE WITH INCREASED PIPECOLICACIDEMIA; RDPA
RET GDB: 120346 RET PROTO-ONCOGENE; RET
SDF1 GDB: 433267 STROMAL CELL-DERIVED FACTOR 1; SDF1
SFTPA1 GDB: 119593 PULMONARY SURFACTANT APOPROTEIN PSP-A; PSAP
SFTPD GDB: 132674 PULMONARY SURFACTANT APOPROTEIN PSP-D; PSP-D
SHFM3 GDB: 386030 SPLIT-HAND/FOOT MALFORMATION, TYPE 3; SHFM3
SIAL GDB: 6549924 NEURAMINIDASE DEFICIENCY
THC2 GDB: 10794765 THROMBOCYTOPENIA
TNFRSF6 GDB: 132671 APOPTOSIS ANTIGEN 1; APT1
UFS GDB: 6380714 UROFACIAL SYNDROME; UFS
UROS GDB: 128112 PORPHYRIA, CONGENITAL ERYTHROPOIETIC; CEP

TABLE 12 Genes, Locations and Genetic Disorders on Chromosome 11
Gene GDB Accession ID OMIM Link
AA GDB: 568984 ATROPHIA AREATA; AA
ABCC8 GDB: 591370 SULFONYLUREA RECEPTOR; SUR PERSISTENT HYPERINSULINEMIC HYPOGLYCEMIA OF INFANCY
ACAT1 GDB: 126861 ALPHA-METHYLACETOACETICACIDURIA
ALX4 GDB: 10450304 PARIETAL FORAMINA, SYMMETRIC; PFM
AMPD3 GDB: 136013 ADENOSINE MONOPHOSPHATE DEAMINASE-3; AMPD3
ANC GDB: 9954484 CANAL CARCINOMA
APOA1 GDB: 119684 AMYLOIDOSIS, FAMILIAL VISCERAL APOLIPOPROTEIN A-I OF HIGH DENSITY LIPOPROTEIN; APOA1
APOA4 GDB: 119000 APOLIPOPROTEIN A-IV; APOA4
APOC3 GDB: 119001 APOLIPOPROTEIN C-III; APOC3
ATM GDB: 593364 ATAXIA-TELANGIECTASIA; AT
BSCL2 GDB: 9963996 SEIP SYNDROME
BWS GDB: 120567 BECKWITH-WIEDEMANN SYNDROME; BWS
CALCA GDB: 120571 CALCITONIN/CALCITONIN-RELATED POLYPEPTIDE, ALPHA; CALCA
CAT GDB: 119049 CATALASE; CAT
CCND1 GDB: 128222 LEUKEMIA, CHRONIC LYMPHATIC; CLL CYCLIN D1; CCND1
CD3E GDB: 119764 CD3E ANTIGEN, EPSILON POLYPEPTIDE; CD3E
CD3G GDB: 119765 T3 T-CELL ANTIGEN, GAMMA CHAIN; T3G; CD3G
CD59 GDB: 119769 CD59 ANTIGEN P18-20; CD59 HUMAN LEUKOCYTE ANTIGEN MIC11; MIC11
CDKN1C GDB: 593296 CYCLIN-DEPENDENT KINASE INHIBITOR 1C; CDKN1C
CLN2 GDB: 125228 CEROID-LIPOFUSCINOSIS, NEURONAL 2, LATE INFANTILE TYPE; CLN2
CNTF GDB: 125919 CILIARY NEUROTROPHIC FACTOR; CNTF
CPT1A GDB: 597642 HYPOGLYCEMIA, HYPOKETOTIC, WITH DEFICIENCY OF CARNITINE PALMITOYLTRANSFERASE CARNITINE PALMITOYLTRANSFERASE I, LIVER; CPT1A
CTSC GDB: 642234 KERATOSIS PALMOPLANTARIS WITH PERIODONTOPATHIA KERATOSIS PALMOPLANTARIS WITH PERIODONTOPATHIA AND ONYCHOGRYPOSIS CATHEPSIN C; CTSC
DDB1 GDB: 595014 DNA DAMAGE-BINDING PROTEIN; DDB1
DDB2 GDB: 595015 DNA DAMAGE-BINDING PROTEIN-2; DDB2
DHCR7 GDB: 9835302 SMITH-LEMLI-OPITZ SYNDROME
DLAT GDB: 118785 CIRRHOSIS, PRIMARY; PBC
DRD4 GDB: 127782 DOPAMINE RECEPTOR D4; DRD4
ECB2 GDB: 9958955 POLYCYTHEMIA, BENIGN FAMILIAL
ED4 GDB: 9837373 DYSPLASIA, MARGARITA TYPE
EVR1 GDB: 134029 EXUDATIVE VITREORETINOPATHY, FAMILIAL; EVR
EXT2 GDB: 344921 EXOSTOSES, MULTIPLE, TYPE II; EXT2 CHONDROSARCOMA
F2 GDB: 119894 COAGULATION FACTOR II; F2
FSHB GDB: 119955 FOLLICLE-STIMULATING HORMONE, BETA POLYPEPTIDE; FSHB
FTH1 GDB: 120617 FERRITIN HEAVY CHAIN 1; FTH1
GIF GDB: 118800 PERNICIOUS ANEMIA, CONGENITAL, DUE TO DEFECT OF INTRINSIC FACTOR
GSD1B GDB: 9837619 GLYCOGEN STORAGE DISEASE Ib
GSD1C GDB: 9837637 STORAGE DISEASE Ic
HBB GDB: 119297 HEMOGLOBIN—BETA LOCUS; HBB
HBBP1 GDB: 120035 HEMOGLOBIN—BETA LOCUS; HBB
HBD GDB: 119298 HEMOGLOBIN—DELTA LOCUS; HBD
HBE1 GDB: 119299 HEMOGLOBIN—EPSILON LOCUS; HBE1
HBG1 GDB: 119300 HEMOGLOBIN, GAMMA A; HBG1
HBG2 GDB: 119301 HEMOGLOBIN, GAMMA G; HBG2
HMBS GDB: 120528 PORPHYRIA, ACUTE INTERMITTENT; AIP
HND GDB: 9954478 HARTNUP DISORDER
HOMG2 GDB: 9956484 MAGNESIUM WASTING, RENAL
HRAS GDB: 120684 BLADDER CANCER V-HA-RAS HARVEY RAT SARCOMA VIRAL ONCOGENE HOMOLOG; HRAS
HVBS1 GDB: 120069 CANCER, HEPATOCELLULAR
IDDM2 GDB: 128530 DIABETES MELLITUS, INSULIN-DEPENDENT, 2 DIABETES MELLITUS, JUVENILE-ONSET INSULIN-DEPENDENT; IDDM
IGER GDB: 119696 IgE RESPONSIVENESS, ATOPIC; IGER
INS GDB: 119349 INSULIN; INS
JBS GDB: 120111 JACOBSEN SYNDROME; JBS
KCNJ11 GDB: 7009893 POTASSIUM CHANNEL, INWARDLY-RECTIFYING, SUBFAMILY J, MEMBER 11; KCNJ11 PERSISTENT HYPERINSULINEMIC HYPOGLYCEMIA OF INFANCY
KCNJ1 GDB: 204206 POTASSIUM CHANNEL, INWARDLY-RECTIFYING, SUBFAMILY J, MEMBER 1; KCNJ1
KCNQ1 GDB: 741244 LONG QT SYNDROME, TYPE 1; LQT1
LDHA GDB: 120141 LACTATE DEHYDROGENASE-A; LDHA
LRP5 GDB: 9836818 OSTEOPOROSIS-PSEUDOGLIOMA SYNDROME; OPPG HIGH BONE MASS
MEN1 GDB: 120173 MULTIPLE ENDOCRINE NEOPLASIA, TYPE 1; MEN1
MLL GDB: 128819 MYELOID/LYMPHOID OR MIXED-LINEAGE LEUKEMIA; MLL
MTACR1 GDB: 125743 MULTIPLE TUMOR ASSOCIATED CHROMOSOME REGION 1; MTACR1
MYBPC3 GDB: 579615 CARDIOMYOPATHY, FAMILIAL HYPERTROPHIC, 4; CMH4 MYOSIN-BINDING PROTEIN C, CARDIAC; MYBPC3
MYO7A GDB: 132543 MYOSIN VIIA; MYO7A DEAFNESS, NEUROSENSORY, AUTOSOMAL RECESSIVE, 2; DFNB2 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 11; DFNA11
NNO1 GDB: 10450513 SIMPLE, AUTOSOMAL DOMINANT
OPPG GDB: 3789438 OSTEOPOROSIS-PSEUDOGLIOMA SYNDROME; OPPG
OPTB1 GDB: 9954474 OSTEOPETROSIS, AUTOSOMAL RECESSIVE
PAX6 GDB: 118997 PAIRED BOX HOMEOTIC GENE 6; PAX6
PC GDB: 119472 PYRUVATE CARBOXYLASE DEFICIENCY
PDX1 GDB: 9836634 PYRUVATE DEHYDROGENASE COMPLEX, COMPONENT X
PGL2 GDB: 511177 PARAGANGLIOMAS, FAMILIAL NONCHROMAFFIN, 2; PGL2
PGR GDB: 119493 PROGESTERONE RESISTANCE
PORC GDB: 128610 PORPHYRIA, CHESTER TYPE; PORC
PTH GDB: 119522 PARATHYROID HORMONE; PTH
PTS GDB: 118856 6-@PYRUVOYLTETRAHYDROPTERIN SYNTHASE; PTS
PVRL1 GDB: 583951 ECTODERMAL DYSPLASIA, CLEFT LIP AND PALATE, HAND AND FOOT DEFORMITY, DYSPLASIA, MARGARITA TYPE POLIOVIRUS RECEPTOR RELATED; PVRR
PYGM GDB: 120329 GLYCOGEN STORAGE DISEASE V
RAG1 GDB: 120334 RECOMBINATION ACTIVATING GENE-1; RAG1
RAG2 GDB: 125186 RECOMBINATION ACTIVATING GENE-2; RAG2
ROM1 GDB: 120350 ROD OUTER SEGMENT PROTEIN-1; ROM1
SAA1 GDB: 120364 SERUM AMYLOID A1; SAA1
SCA5 GDB: 378219 SPINOCEREBELLAR ATAXIA 5; SCA5
SCZD2 GDB: 118874 DISORDER-2; SCZD2
SDHD GDB: 132456 PARAGANGLIOMAS, FAMILIAL NONCHROMAFFIN, 1; PGL1
SERPING1 GDB: 119041 ANGIONEUROTIC EDEMA, HEREDITARY; HANE
SMPD1 GDB: 128144 NIEMANN-PICK DISEASE
TCIRG1 GDB: 9956269 OSTEOPETROSIS, AUTOSOMAL RECESSIVE
TCL2 GDB: 9954468 LEUKEMIA, ACUTE T-CELL; ATL
TECTA GDB: 6837718 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 8; DFNA8 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 12; DFNA12
TH GDB: 119612 TYROSINE HYDROXYLASE; TH
TREH GDB: 9958953 TREHALASE
TSG101 GDB: 1313414 TUMOR SUSCEPTIBILITY GENE 101; TSG101
TYR GDB: 120476 ALBINISM I
USH1C GDB: 132544 USHER SYNDROME, TYPE IC; USH1C
VMD2 GDB: 133795 VITELLIFORM MACULAR DYSTROPHY; VMD2
VRNI GDB: 135662 VITREORETINOPATHY, NEOVASCULAR INFLAMMATORY; VRNI
WT1 GDB: 120496 FRASIER SYNDROME WILMS TUMOR; WT1
WT2 GDB: 118886 MULTIPLE TUMOR ASSOCIATED CHROMOSOME REGION 1; MTACR1
ZNF145 GDB: 230064 PROMYELOCYTIC LEUKEMIA ZINC FINGER; PLZF

TABLE 13 Genes, Locations and Genetic Disorders on Chromosome 12
Gene | GDB Accession ID | OMIM Link
A2M | GDB: 119639 | ALPHA-2-MACROGLOBULIN; A2M
AAAS | GDB: 9954498 | GLUCOCORTICOID DEFICIENCY AND ACHALASIA
ACADS | GDB: 118959 | ACYL-CoA DEHYDROGENASE, SHORT-CHAIN; ACADS
ACLS | GDB: 136346 | ACROCALLOSAL SYNDROME; ACLS
ACVRL1 | GDB: 230240 | OSLER-RENDU-WEBER SYNDROME 2; ORW2 ACTIVIN A RECEPTOR, TYPE II-LIKE KINASE 1; ACVRL1
ADHR | GDB: 9954488 | VITAMIN D-RESISTANT RICKETS, AUTOSOMAL DOMINANT
ALDH2 | GDB: 119668 | ALDEHYDE DEHYDROGENASE-2; ALDH2
AMHR2 | GDB: 696210 | ANTI-MULLERIAN HORMONE TYPE II RECEPTOR; AMHR2
AOM | GDB: 118998 | STICKLER SYNDROME, TYPE I; STL1
AQP2 | GDB: 141853 | AQUAPORIN-2; AQP2 DIABETES INSIPIDUS, RENAL TYPE DIABETES INSIPIDUS, RENAL TYPE, AUTOSOMAL RECESSIVE
ATD | GDB: 696353 | ASPHYXIATING THORACIC DYSTROPHY; ATD
ATP2A2 | GDB: 119717 | ATPase, Ca(2+)-TRANSPORTING, SLOW-TWITCH; ATP2A2 DARIER-WHITE DISEASE; DAR
BDC | GDB: 5584359 | BRACHYDACTYLY, TYPE C; BDC
C1R | GDB: 119729 | COMPLEMENT COMPONENT-C1r, DEFICIENCY OF
CD4 | GDB: 119767 | T-CELL ANTIGEN T4/LEU3; CD4
CDK4 | GDB: 204022 | CYCLIN-DEPENDENT KINASE 4; CDK4
CNA1 | GDB: 252119 | CORNEA PLANA 1; CNA1
COL2A1 | GDB: 119063 | STICKLER SYNDROME, TYPE I; STL1 COLLAGEN, TYPE II, ALPHA-1 CHAIN; COL2A1 ACHONDROGENESIS, TYPE II; ACG2
CYP27B1 | GDB: 9835730 | PSEUDOVITAMIN D DEFICIENCY RICKETS; PDDR
DRPLA | GDB: 270336 | DENTATORUBRAL-PALLIDOLUYSIAN ATROPHY; DRPLA
ENUR2 | GDB: 666422 | ENURESIS, NOCTURNAL, 2; ENUR2
FEOM1 | GDB: 345037 | FIBROSIS OF EXTRAOCULAR MUSCLES, CONGENITAL; FEOM
FPF | GDB: 9848880 | PERIODIC FEVER, AUTOSOMAL DOMINANT
GNB3 | GDB: 120005 | GUANINE NUCLEOTIDE-BINDING PROTEIN, BETA POLYPEPTIDE-3; GNB3
GNS | GDB: 120006 | MUCOPOLYSACCHARIDOSIS TYPE IIID
HAL | GDB: 120746 | HISTIDINEMIA
HBP1 | GDB: 701889 | BRACHYDACTYLY WITH HYPERTENSION
HMGIC | GDB: 362658 | HIGH MOBILITY GROUP PROTEIN ISOFORM I-C; HMGIC
HMN2 | GDB: 9954508 | MUSCULAR ATROPHY, ADULT SPINAL
HPD | GDB: 135978 | TYROSINEMIA, TYPE III
IGF1 | GDB: 120081 | INSULINLIKE GROWTH FACTOR 1; IGF1
KCNA1 | GDB: 127903 | POTASSIUM VOLTAGE-GATED CHANNEL, SHAKER-RELATED SUBFAMILY, MEMBER
KERA | GDB: 252121 | CORNEA PLANA 2; CNA2
KRAS2 | GDB: 120120 | V-KI-RAS2 KIRSTEN RAT SARCOMA 2 VIRAL ONCOGENE HOMOLOG; KRAS2
KRT1 | GDB: 128198 | KERATIN 1; KRT1
KRT2A | GDB: 407640 | ICHTHYOSIS, BULLOUS TYPE KERATIN 2A; KRT2A
KRT3 | GDB: 136276 | KERATIN 3; KRT3
KRT4 | GDB: 120697 | KERATIN 4; KRT4
KRT5 | GDB: 128110 | EPIDERMOLYSIS BULLOSA HERPETIFORMIS, DOWLING-MEARA TYPE KERATIN 5; KRT5
KRT6A | GDB: 128111 | KERATIN 6A; KRT6A
KRT6B | GDB: 128113 | KERATIN 6B; KRT6B PACHYONYCHIA CONGENITA, JACKSON-LAWLER TYPE
KRTHB6 | GDB: 702078 | MONILETHRIX KERATIN, HAIR BASIC (TYPE II) 6
LDHB | GDB: 120147 | LACTATE DEHYDROGENASE-B; LDHB
LYZ | GDB: 120160 | AMYLOIDOSIS, FAMILIAL VISCERAL LYSOZYME; LYZ
MGCT | GDB: 9954504 | TESTICULAR TUMORS
MPE | GDB: 120191 | MALIGNANT PROLIFERATION OF
MVK | GDB: 134189 | MEVALONICACIDURIA
MYL2 | GDB: 128829 | MYOSIN, LIGHT CHAIN, REGULATORY VENTRICULAR; MYL2
NS1 | GDB: 439388 | NOONAN SYNDROME 1; NS1
OAP | GDB: 120245 | OSTEOARTHROSIS, PRECOCIOUS; OAP
PAH | GDB: 119470 | PHENYLKETONURIA; PKU1
PPKB | GDB: 696352 | PALMOPLANTAR KERATODERMA, BOTHNIAN TYPE; PPKB
PRB3 | GDB: 119513 | PAROTID SALIVARY GLYCOPROTEIN; G1
PXR1 | GDB: 433739 | ZELLWEGER SYNDROME; ZS PEROXISOME RECEPTOR 1; PXR1
RLS | GDB: 11501392 | ACROMELALGIA, HEREDITARY
RSN | GDB: 139158 | RESTIN; RSN
SAS | GDB: 128054 | SARCOMA AMPLIFIED SEQUENCE; SAS
SCA2 | GDB: 128034 | SPINOCEREBELLAR ATAXIA 2; SCA2 ATAXIN-2; ATX2
SCNN1A | GDB: 366596 | SODIUM CHANNEL, NONVOLTAGE-GATED, 1; SCNN1A
SMAL | GDB: 9954506 | SPINAL MUSCULAR ATROPHY, CONGENITAL NONPROGRESSIVE, OF LOWER LIMBS
SPPM | GDB: 9954502 | SCAPULOPERONEAL MYOPATHY; SPM
SPSMA | GDB: 9954510 | SCAPULOPERONEAL AMYOTROPHY, NEUROGENIC, NEW ENGLAND TYPE
TBX3 | GDB: 681969 | ULNAR-MAMMARY SYNDROME; UMS T-BOX 3; TBX3
TBX5 | GDB: 6175917 | HOLT-ORAM SYNDROME; HOS T-BOX 5; TBX5
TCF1 | GDB: 125297 | TRANSCRIPTION FACTOR 1, HEPATIC; TCF1 MATURITY-ONSET DIABETES OF THE YOUNG, TYPE III; MODY3
TPI1 | GDB: 119617 | TRIOSEPHOSPHATE ISOMERASE 1; TPI1
TSC3 | GDB: 127930 | SCLEROSIS-3; TSC3
ULR | GDB: 594089 | UTERINE
VDR | GDB: 120487 | VITAMIN D-RESISTANT RICKETS WITH END-ORGAN UNRESPONSIVENESS TO 1,25-DIHYDROXYCHOLECALCIFEROL VITAMIN D RECEPTOR; VDR
VWF | GDB: 119125 | VON WILLEBRAND DISEASE; VWD

TABLE 14 Genes, Locations and Genetic Disorders on Chromosome 13
Gene | GDB Accession ID | OMIM Link
ATP7B | GDB: 120494 | WILSON DISEASE; WND
BRCA2 | GDB: 387848 | BREAST CANCER 2, EARLY-ONSET; BRCA2
BRCD1 | GDB: 9954522 | BREAST CANCER, DUCTAL, 1; BRCD1
CLN5 | GDB: 230991 | CEROID-LIPOFUSCINOSIS, NEURONAL 5; CLN5
CPB2 | GDB: 129546 | CARBOXYPEPTIDASE B2, PLASMA; CPB2
ED2 | GDB: 9834522 | ECTODERMAL DYSPLASIA, HIDROTIC; HED
EDNRB | GDB: 129075 | ENDOTHELIN-B RECEPTOR; EDNRB HIRSCHSPRUNG DISEASE-2; HSCR2
ENUR1 | GDB: 594516 | ENURESIS, NOCTURNAL, 1; ENUR1
ERCC5 | GDB: 120515 | EXCISION-REPAIR, COMPLEMENTING DEFECTIVE, IN CHINESE HAMSTER, 5; ERCC5
F10 | GDB: 119890 | X, QUANTITATIVE VARIATION IN FACTOR X DEFICIENCY; F10
F7 | GDB: 119897 | FACTOR VII DEFICIENCY
GJB2 | GDB: 125247 | GAP JUNCTION PROTEIN, BETA-2, 26 KD; GJB2 DEAFNESS, NEUROSENSORY, AUTOSOMAL RECESSIVE, 1; DFNB1 DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 3; DFNA3
GJB6 | GDB: 9958357 | ECTODERMAL DYSPLASIA, HIDROTIC; HED DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 3; DFNA3
IPF1 | GDB: 448899 | INSULIN PROMOTER FACTOR 1; IPF1
MBS1 | GDB: 128365 | MOEBIUS SYNDROME; MBS
MCOR | GDB: 9954520 | CONGENITAL
PCCA | GDB: 119473 | GLYCINEMIA, KETOTIC, I
RB1 | GDB: 118734 | BLADDER CANCER RETINOBLASTOMA; RB1
RHOK | GDB: 371598 | RHODOPSIN KINASE; RHOK
SCZD7 | GDB: 9864734 | DISORDER-2; SCZD2
SGCG | GDB: 3763329 | MUSCULAR DYSTROPHY, LIMB GIRDLE, TYPE 2C; LGMD2C
SLC10A2 | GDB: 677534 | SOLUTE CARRIER FAMILY 10, MEMBER 2; SLC10A2
SLC25A15 | GDB: 120042 | HYPERORNITHINEMIA-HYPERAMMONEMIA-HOMOCITRULLINURIA SYNDROME
STARP1 | GDB: 635459 | STEROIDOGENIC ACUTE REGULATORY PROTEIN; STAR
ZNF198 | GDB: 6382650 | ZINC FINGER PROTEIN-198; ZNF198

TABLE 15 Genes, Locations and Genetic Disorders on Chromosome 14
Gene | GDB Accession ID | OMIM Link
ACHM1 | GDB: 132458 | COLORBLINDNESS, TOTAL
ARVD1 | GDB: 371339 | ARRHYTHMOGENIC RIGHT VENTRICULAR DYSPLASIA, FAMILIAL, 1; ARVD1
CTAA1 | GDB: 265299 | CATARACT, ANTERIOR POLAR 1; CTAA1
DAD1 | GDB: 407505 | DEFENDER AGAINST CELL DEATH; DAD1
DFNB5 | GDB: 636176 | DEAFNESS, NEUROSENSORY, AUTOSOMAL RECESSIVE, 5; DFNB5
EML1 | GDB: 6328385 | USHER SYNDROME, TYPE IA; USH1A
GALC | GDB: 119970 | KRABBE DISEASE
GCH1 | GDB: 118798 | DYSTONIA, PROGRESSIVE, WITH DIURNAL VARIATION GTP CYCLOHYDROLASE I DEFICIENCY GTP CYCLOHYDROLASE I; GCH1
HE1 | GDB: 9957680 | MALFORMATIONS, MULTIPLE, WITH LIMB ABNORMALITIES AND HYPOPITUITARISM
IBGC1 | GDB: 10450404 | CEREBRAL CALCIFICATION, NONARTERIOSCLEROTIC
IGH@ | GDB: 118731 | IgA CONSTANT HEAVY CHAIN 1; IGHA1 IMMUNOGLOBULIN: D (DIVERSITY) REGION OF HEAVY CHAIN IgA CONSTANT HEAVY CHAIN 2; IGHA2 IMMUNOGLOBULIN: J (JOINING) LOCI OF HEAVY CHAIN; IGHJ IMMUNOGLOBULIN: HEAVY Mu CHAIN; Mu1; IGHM1 IMMUNOGLOBULIN: VARIABLE REGION OF HEAVY CHAINS—Hv1; IGHV IgG HEAVY CHAIN LOCUS; IGHG1 IMMUNOGLOBULIN Gm-2; IGHG2 IMMUNOGLOBULIN Gm-3; IGHG3 IMMUNOGLOBULIN Gm-4; IGHG4 IMMUNOGLOBULIN: HEAVY DELTA CHAIN; IGHD IMMUNOGLOBULIN: HEAVY EPSILON CHAIN; IGHE
IGHC group | GDB: 9992632 | IgA CONSTANT HEAVY CHAIN 1; IGHA1 IgA CONSTANT HEAVY CHAIN 2; IGHA2 IMMUNOGLOBULIN: HEAVY Mu CHAIN; Mu1; IGHM1 IgG HEAVY CHAIN LOCUS; IGHG1 IMMUNOGLOBULIN Gm-2; IGHG2 IMMUNOGLOBULIN Gm-3; IGHG3 IMMUNOGLOBULIN Gm-4; IGHG4 IMMUNOGLOBULIN: HEAVY DELTA CHAIN; IGHD IMMUNOGLOBULIN: HEAVY EPSILON CHAIN; IGHE
IGHG1 | GDB: 120085 | IgG HEAVY CHAIN LOCUS; IGHG1
IGHM | GDB: 120086 | IMMUNOGLOBULIN: HEAVY Mu CHAIN; Mu1; IGHM1
IGHR | GDB: 9954529 | G1(A1) SYNDROME
IV | GDB: 139274 | INVERSUS VISCERUM
LTBP2 | GDB: 453890 | LATENT TRANSFORMING GROWTH FACTOR-BETA BINDING PROTEIN 2; LTBP2
MCOP | GDB: 9954527 | MICROPHTHALMOS
MJD | GDB: 118840 | MACHADO-JOSEPH DISEASE; MJD
MNG1 | GDB: 6540062 | GOITER, MULTINODULAR 1; MNG1
MPD1 | GDB: 230271 | MYOPATHY, LATE DISTAL HEREDITARY
MPS3C | GDB: 9954532 | MUCOPOLYSACCHARIDOSIS TYPE IIIC
MYH6 | GDB: 120214 | MYOSIN, HEAVY POLYPEPTIDE 6; MYH6
MYH7 | GDB: 120215 | MYOSIN, CARDIAC, HEAVY CHAIN, BETA; MYH7
NP | GDB: 120239 | NUCLEOSIDE PHOSPHORYLASE; NP
PABPN1 | GDB: 567135 | OCULOPHARYNGEAL MUSCULAR DYSTROPHY; OPMD OCULOPHARYNGEAL MUSCULAR DYSTROPHY, AUTOSOMAL RECESSIVE POLYADENYLATE-BINDING PROTEIN-2; PABP2
PSEN1 | GDB: 135682 | ALZHEIMER DISEASE, FAMILIAL, TYPE 3; AD3
PYGL | GDB: 120328 | GLYCOGEN STORAGE DISEASE VI
RPGRIP1 | GDB: 11498766 | AMAUROSIS CONGENITA OF LEBER I
SERPINA1 | GDB: 120289 | PROTEASE INHIBITOR 1; PI
SERPINA3 | GDB: 118955 | ALPHA-1-ANTICHYMOTRYPSIN; AACT
SERPINA6 | GDB: 127865 | CORTICOSTEROID-BINDING GLOBULIN; CBG
SLC7A7 | GDB: 9863033 | DIBASICAMINOACIDURIA II
SPG3A | GDB: 230126 | SPASTIC PARAPLEGIA-3, AUTOSOMAL DOMINANT; SPG3A
SPTB | GDB: 119602 | ELLIPTOCYTOSIS, RHESUS-UNLINKED TYPE HEREDITARY HEMOLYTIC SPECTRIN, BETA, ERYTHROCYTIC; SPTB
TCL1A | GDB: 250785 | T-CELL LYMPHOMA OR LEUKEMIA
TCRAV17S1 | GDB: 642130 | T-CELL ANTIGEN RECEPTOR, ALPHA SUBUNIT; TCRA
TCRAV5S1 | GDB: 451966 | T-CELL ANTIGEN RECEPTOR, ALPHA SUBUNIT; TCRA
TGM1 | GDB: 125299 | TRANSGLUTAMINASE 1; TGM1 ICHTHYOSIS CONGENITA
TITF1 | GDB: 132588 | THYROID TRANSCRIPTION FACTOR 1; TITF1
TMIP | GDB: 9954523 | AND ULNA, DUPLICATION OF, WITH ABSENCE OF TIBIA AND RADIUS
TRA@ | GDB: 120404 | T-CELL ANTIGEN RECEPTOR, ALPHA SUBUNIT; TCRA
TSHR | GDB: 125313 | THYROTROPIN, UNRESPONSIVENESS TO
USH1A | GDB: 118885 | USHER SYNDROME, TYPE IA; USH1A
VP | GDB: 120492 | PORPHYRIA VARIEGATA

TABLE 16 Genes, Locations and Genetic Disorders on Chromosome 15
Gene | GDB Accession ID | OMIM Link
ACCPN | GDB: 5457725 | CORPUS CALLOSUM, AGENESIS OF, WITH NEURONOPATHY
AHO2 | GDB: 9954535 | HEREDITARY OSTEODYSTROPHY-2; AHO2
ANCR | GDB: 119678 | ANGELMAN SYNDROME
B2M | GDB: 119028 | BETA-2-MICROGLOBULIN; B2M
BBS4 | GDB: 511199 | BARDET-BIEDL SYNDROME, TYPE 4; BBS4
BLM | GDB: 135698 | BLOOM SYNDROME; BLM
CAPN3 | GDB: 119751 | CALPAIN, LARGE POLYPEPTIDE L3; CAPN3 MUSCULAR DYSTROPHY, LIMB-GIRDLE, TYPE 2; LGMD2
CDAN1 | GDB: 9823267 | DYSERYTHROPOIETIC ANEMIA, CONGENITAL, TYPE I
CDAN3 | GDB: 386192 | DYSERYTHROPOIETIC ANEMIA, CONGENITAL, TYPE III; CDAN3
CLN6 | GDB: 4073043 | CEROID-LIPOFUSCINOSIS, NEURONAL 6, LATE INFANTILE, VARIANT; CLN6
CMH3 | GDB: 138299 | CARDIOMYOPATHY, FAMILIAL HYPERTROPHIC, 3; CMH3
CYP19 | GDB: 119830 | CYTOCHROME P450, SUBFAMILY XIX; CYP19
CYP1A1 | GDB: 120604 | CYTOCHROME P450, SUBFAMILY I, POLYPEPTIDE 1; CYP1A1
CYP1A2 | GDB: 118780 | CYTOCHROME P450, SUBFAMILY I, POLYPEPTIDE 2; CYP1A2
DYX1 | GDB: 1391796 | DYSLEXIA, SPECIFIC, 1; DYX1
EPB42 | GDB: 127385 | HEREDITARY HEMOLYTIC PROTEIN 4.2, ERYTHROCYTIC; EPB42
ETFA | GDB: 119121 | GLUTARICACIDURIA IIA; GA IIA
EYCL3 | GDB: 4590306 | EYE COLOR-3; EYCL3
FAH | GDB: 119901 | TYROSINEMIA, TYPE I
FBN1 | GDB: 127115 | FIBRILLIN-1; FBN1 MARFAN SYNDROME; MFS
FES | GDB: 119906 | V-FES FELINE SARCOMA VIRAL/V-FPS FUJINAMI AVIAN SARCOMA VIRAL ONCOGENE
HCVS | GDB: 119306 | CORONAVIRUS 229E SUSCEPTIBILITY; CVS
HEXA | GDB: 120040 | TAY-SACHS DISEASE; TSD
IVD | GDB: 119354 | ISOVALERICACIDEMIA; IVA
LCS1 | GDB: 11500552 | CHOLESTASIS-LYMPHEDEMA SYNDROME
LIPC | GDB: 119366 | LIPASE, HEPATIC; LIPC
MYO5A | GDB: 218824 | MYOSIN VA; MYO5A
OCA2 | GDB: 136820 | ALBINISM II
OTSC1 | GDB: 9860473 | OTOSCLEROSIS
PWCR | GDB: 120325 | PRADER-WILLI SYNDROME
RLBP1 | GDB: 127341 | RETINALDEHYDE-BINDING PROTEIN 1; RLBP1
SLC12A1 | GDB: 386121 | SOLUTE CARRIER FAMILY 12, MEMBER 1; SLC12A1
SPG6 | GDB: 511201 | SPASTIC PARAPLEGIA 6, AUTOSOMAL DOMINANT; SPG6
TPM1 | GDB: 127875 | TROPOMYOSIN 1; TPM1
UBE3A | GDB: 228487 | ANGELMAN SYNDROME UBIQUITIN-PROTEIN LIGASE E3A; UBE3A
WMS | GDB: 5583902 | WEILL-MARCHESANI SYNDROME

TABLE 17 Genes, Locations and Genetic Disorders on Chromosome 16
Gene | GDB Accession ID | OMIM Link
ABCC6 | GDB: 9315106 | PSEUDOXANTHOMA ELASTICUM, AUTOSOMAL DOMINANT; PXE PSEUDOXANTHOMA ELASTICUM, AUTOSOMAL RECESSIVE; PXE
ALDOA | GDB: 118993 | ALDOLASE A, FRUCTOSE-BISPHOSPHATE; ALDOA
APRT | GDB: 119003 | ADENINE PHOSPHORIBOSYLTRANSFERASE; APRT
ATP2A1 | GDB: 119716 | ATPase, Ca(2+)-TRANSPORTING, FAST-TWITCH 1; ATP2A1 BRODY MYOPATHY
BBS2 | GDB: 229992 | BARDET-BIEDL SYNDROME, TYPE 2; BBS2
CARD15 | GDB: 11026232 | SYNOVITIS, GRANULOMATOUS, WITH UVEITIS AND CRANIAL NEUROPATHIES REGIONAL ENTERITIS
CATM | GDB: 701219 | MICROPHTHALMIA-CATARACT
CDH1 | GDB: 120484 | CADHERIN 1; CDH1
CETP | GDB: 119773 | CHOLESTERYL ESTER TRANSFER PROTEIN, PLASMA; CETP
CHST6 | GDB: 131407 | CORNEAL DYSTROPHY, MACULAR TYPE
CLN3 | GDB: 120593 | CEROID-LIPOFUSCINOSIS, NEURONAL 3, JUVENILE; CLN3
CREBBP | GDB: 437159 | RUBINSTEIN SYNDROME CREB-BINDING PROTEIN; CREBBP
CTH | GDB: 119086 | CYSTATHIONINURIA
CTM | GDB: 119819 | CATARACT, ZONULAR
CYBA | GDB: 125238 | GRANULOMATOUS DISEASE, CHRONIC, AUTOSOMAL CYTOCHROME-b-NEGATIVE FORM
CYLD | GDB: 701216 | EPITHELIOMA, HEREDITARY MULTIPLE BENIGN CYSTIC
DHS | GDB: 9958268 | XEROCYTOSIS, HEREDITARY
DNASE1 | GDB: 132846 | DEOXYRIBONUCLEASE I; DNASE1
DPEP1 | GDB: 128059 | RENAL DIPEPTIDASE
ERCC4 | GDB: 119113 | EXCISION-REPAIR, COMPLEMENTING DEFECTIVE, IN CHINESE HAMSTER, 4; ERCC4 XERODERMA PIGMENTOSUM, COMPLEMENTATION GROUP F; XPF
FANCA | GDB: 701221 | FANCONI ANEMIA, COMPLEMENTATION GROUP A; FACA
GALNS | GDB: 129085 | MUCOPOLYSACCHARIDOSIS TYPE IVA
GAN | GDB: 9864885 | NEUROPATHY, GIANT AXONAL; GAN
HAGH | GDB: 119292 | HYDROXYACYL GLUTATHIONE HYDROLASE; HAGH
HBA1 | GDB: 119293 | HEMOGLOBIN—ALPHA LOCUS-1; HBA1
HBA2 | GDB: 119294 | HEMOGLOBIN—ALPHA LOCUS-2; HBA2
HBHR | GDB: 9954541 | HEMOGLOBIN H-RELATED MENTAL RETARDATION
HBQ1 | GDB: 120036 | HEMOGLOBIN—THETA-1 LOCUS; HBQ1
HBZ | GDB: 119302 | HEMOGLOBIN—ZETA LOCUS; HBZ
HBZP | GDB: 120037 | HEMOGLOBIN—ZETA LOCUS; HBZ
HP | GDB: 119314 | HAPTOGLOBIN; HP
HSD11B2 | GDB: 409951 | CORTISOL 11-BETA-KETOREDUCTASE DEFICIENCY
IL4R | GDB: 118823 | INTERLEUKIN-4 RECEPTOR; IL4R
LIPB | GDB: 119365 | LIPASE B, LYSOSOMAL ACID; LIPB
MC1R | GDB: 135162 | MELANOCORTIN-1 RECEPTOR; MC1R
MEFV | GDB: 125263 | MEDITERRANEAN FEVER, FAMILIAL; MEFV
MHC2TA | GDB: 6268475 | MHC CLASS II TRANSACTIVATOR; MHC2TA
MLYCD | GDB: 11500940 | MALONYL CoA DECARBOXYLASE DEFICIENCY
PHKB | GDB: 120286 | PHOSPHORYLASE KINASE, BETA SUBUNIT; PHKB
PHKG2 | GDB: 140316 | PHOSPHORYLASE KINASE, TESTIS/LIVER, GAMMA 2; PHKG2
PKD1 | GDB: 120293 | POLYCYSTIC KIDNEYS POLYCYSTIC KIDNEY DISEASE 1; PKD1
PKDTS | GDB: 9954545 | POLYCYSTIC KIDNEY DISEASE, INFANTILE SEVERE, WITH TUBEROUS SCLEROSIS;
PMM2 | GDB: 438697 | CARBOHYDRATE-DEFICIENT GLYCOPROTEIN SYNDROME, TYPE I; CDG1 PHOSPHOMANNOMUTASE 2; PMM2
PXE | GDB: 6053895 | PSEUDOXANTHOMA ELASTICUM, AUTOSOMAL DOMINANT; PXE PSEUDOXANTHOMA ELASTICUM, AUTOSOMAL RECESSIVE; PXE
SALL1 | GDB: 4216161 | TOWNES-BROCKS SYNDROME; TBS SAL-LIKE 1; SALL1
SCA4 | GDB: 250364 | SPINOCEREBELLAR ATAXIA 4; SCA4
SCNN1B | GDB: 434471 | SODIUM CHANNEL, NONVOLTAGE-GATED 1 BETA; SCNN1B
SCNN1G | GDB: 568759 | SODIUM CHANNEL, NONVOLTAGE-GATED 1 GAMMA; SCNN1G
TAT | GDB: 120398 | TYROSINE TRANSAMINASE DEFICIENCY
TSC2 | GDB: 120466 | TUBEROUS SCLEROSIS-2; TSC2
VDI | GDB: 119629 | DEFECTIVE INTERFERING PARTICLE INDUCTION, CONTROL OF
WT3 | GDB: 9958957 | WILMS TUMOR, TYPE III; WT3

TABLE 18 Genes, Locations and Genetic Disorders on Chromosome 17
Gene | GDB Accession ID | OMIM Link
ABR | GDB: 119642 | ACTIVE BCR-RELATED GENE; ABR
ACACA | GDB: 120534 | ACETYL-CoA CARBOXYLASE DEFICIENCY
ACADVL | GDB: 1248185 | ACYL-CoA DEHYDROGENASE, VERY-LONG-CHAIN, DEFICIENCY OF
ACE | GDB: 119840 | DIPEPTIDYL CARBOXYPEPTIDASE-1; DCP1
ALDH3A2 | GDB: 1316855 | SJOGREN-LARSSON SYNDROME; SLS
APOH | GDB: 118887 | APOLIPOPROTEIN H; APOH
ASPA | GDB: 231014 | SPONGY DEGENERATION OF CENTRAL NERVOUS SYSTEM
AXIN2 | GDB: 9864782 | CANCER OF COLON
BCL5 | GDB: 125178 | LEUKEMIA/LYMPHOMA, CHRONIC B-CELL, 5; BCL5
BHD | GDB: 11498904 | WITH TRICHODISCOMAS AND ACROCHORDONS
BLMH | GDB: 3801467 | BLEOMYCIN HYDROLASE
BRCA1 | GDB: 126611 | BREAST CANCER, TYPE 1; BRCA1
CACD | GDB: 5885801 | CHOROIDAL DYSTROPHY, CENTRAL AREOLAR; CACD
CCA1 | GDB: 118763 | CATARACT, CONGENITAL, CERULEAN TYPE 1; CCA1
CCZS | GDB: 681973 | CATARACT, CONGENITAL ZONULAR, WITH SUTURAL OPACITIES; CCZS
CHRNB1 | GDB: 120587 | CHOLINERGIC RECEPTOR, NICOTINIC, BETA POLYPEPTIDE 1; CHRNB1
CHRNE | GDB: 132246 | CHOLINERGIC RECEPTOR, NICOTINIC, EPSILON POLYPEPTIDE; CHRNE
CMT1A | GDB: 119785 | CHARCOT-MARIE-TOOTH DISEASE, TYPE 1A; CMT1A NEUROPATHY, HEREDITARY, WITH LIABILITY TO PRESSURE PALSIES; HNPP
COL1A1 | GDB: 119061 | COLLAGEN, TYPE I, ALPHA-1 CHAIN; COL1A1 OSTEOGENESIS IMPERFECTA TYPE I OSTEOGENESIS IMPERFECTA TYPE IV; OI4
CORD5 | GDB: 568473 | CONE-ROD DYSTROPHY-5; CORD5
CTNS | GDB: 700761 | CYSTINOSIS, EARLY-ONSET OR INFANTILE NEPHROPATHIC TYPE
EPX | GDB: 377700 | EOSINOPHIL PEROXIDASE; EPX
ERBB2 | GDB: 120613 | V-ERB-B2 AVIAN ERYTHROBLASTIC LEUKEMIA VIRAL ONCOGENE HOMOLOG 2; ERBB2
G6PC | GDB: 231927 | GLYCOGEN STORAGE DISEASE I; GSD-I
GAA | GDB: 119965 | GLYCOGEN STORAGE DISEASE II
GALK1 | GDB: 119246 | GALACTOKINASE DEFICIENCY
GCGR | GDB: 304516 | GLUCAGON RECEPTOR, GCGR
GFAP | GDB: 118799 | GLIAL FIBRILLARY ACIDIC PROTEIN; GFAP ALEXANDER DISEASE
GH1 | GDB: 119982 | GROWTH HORMONE 1; GH1
GH2 | GDB: 119983 | GROWTH HORMONE 2; GH2
GP1BA | GDB: 118806 | GIANT PLATELET SYNDROME
GPSC | GDB: 9954564 | FAMILIAL PROGRESSIVE SUBCORTICAL
GUCY2D | GDB: 136012 | AMAUROSIS CONGENITA OF LEBER I GUANYLATE CYCLASE 2D, MEMBRANE; GUC2D CONE-ROD DYSTROPHY-6; CORD6
ITGA2B | GDB: 120012 | THROMBASTHENIA OF GLANZMANN AND NAEGELI
ITGB3 | GDB: 120013 | INTEGRIN, BETA-3; ITGB3
ITGB4 | GDB: 128028 | INTEGRIN, BETA-4; ITGB4
KRT10 | GDB: 118828 | KERATIN 10; KRT10
KRT12 | GDB: 5583953 | CORNEAL DYSTROPHY, JUVENILE EPITHELIAL, OF MEESMANN KERATIN 12; KRT12
KRT13 | GDB: 120740 | KERATIN 13; KRT13
KRT14 | GDB: 132145 | KERATIN 14; KRT14 GLUTATHIONE SYNTHETASE; GSS
KRT14L1 | GDB: 120121 | KERATIN 14; KRT14
KRT14L2 | GDB: 120122 | KERATIN 14; KRT14
KRT14L3 | GDB: 120123 | KERATIN 14; KRT14
KRT16 | GDB: 136207 | KERATIN 16; KRT16
KRT16L1 | GDB: 120125 | KERATIN 16; KRT16
KRT16L2 | GDB: 120126 | KERATIN 16; KRT16
KRT17 | GDB: 136211 | KERATIN 17; KRT17 PACHYONYCHIA CONGENITA, JACKSON-LAWLER TYPE
KRT9 | GDB: 303970 | HYPERKERATOSIS, LOCALIZED EPIDERMOLYTIC
MAPT | GDB: 119434 | MICROTUBULE-ASSOCIATED PROTEIN TAU; MAPT PALLIDOPONTONIGRAL DEGENERATION; PPND DISINHIBITION-DEMENTIA-PARKINSONISM-AMYOTROPHY COMPLEX; DDPAC
MDB | GDB: 9958959 | MEDULLOBLASTOMA; MDB
MDCR | GDB: 120525 | MILLER-DIEKER LISSENCEPHALY SYNDROME; MDLS PLATELET-ACTIVATING FACTOR ACETYLHYDROLASE, GAMMA SUBUNIT
MGI | GDB: 9954550 | MYASTHENIA GRAVIS, FAMILIAL INFANTILE; FIMG
MHS2 | GDB: 132580 | MALIGNANT HYPERTHERMIA SUSCEPTIBILITY-2; MHS2
MKS1 | GDB: 681967 | MECKEL SYNDROME; MKS
MPO | GDB: 120192 | MYELOPEROXIDASE DEFICIENCY
MUL | GDB: 636050 | MULIBREY NANISM; MUL
MYO15A | GDB: 9838006 | DEAFNESS, NEUROSENSORY, AUTOSOMAL RECESSIVE, 3; DFNB3
NAGLU | GDB: 636533 | MUCOPOLYSACCHARIDOSIS TYPE IIIB
NAPB | GDB: 9954572 | NEURITIS WITH BRACHIAL PREDILECTION; NAPB
NF1 | GDB: 120231 | NEUROFIBROMATOSIS, TYPE I; NF1
NME1 | GDB: 127965 | NON-METASTATIC CELLS 1, PROTEIN EXPRESSED IN; NME1
P4HB | GDB: 120708 | PROLYL-4-HYDROXYLASE, BETA POLYPEPTIDE; PHDB; PROHB
PAFAH1B1 | GDB: 677430 | MILLER-DIEKER LISSENCEPHALY SYNDROME; MDLS PLATELET-ACTIVATING FACTOR ACETYLHYDROLASE, GAMMA SUBUNIT
PECAM1 | GDB: 696372 | PLATELET-ENDOTHELIAL CELL ADHESION MOLECULE; PECAM1
PEX12 | GDB: 6155804 | ZELLWEGER SYNDROME; ZS PEROXIN-12; PEX12
PHB | GDB: 126600 | PROHIBITIN; PHB
PMP22 | GDB: 134190 | CHARCOT-MARIE-TOOTH DISEASE, TYPE 1A; CMT1A HYPERTROPHIC NEUROPATHY OF DEJERINE-SOTTAS PERIPHERAL MYELIN PROTEIN 22; PMP22
PRKAR1A | GDB: 120313 | MYXOMA, SPOTTY PIGMENTATION, AND ENDOCRINE OVERACTIVITY PROTEIN KINASE, cAMP-DEPENDENT, REGULATORY, TYPE I, ALPHA; PRKAR1A
PRKCA | GDB: 128015 | PROTEIN KINASE C, ALPHA; PRKCA
PRKWNK4 | GDB: 9954566 | PSEUDOHYPOALDOSTERONISM TYPE II, LOCUS B; PHA2B
PRP8 | GDB: 9957697 | RETINITIS PIGMENTOSA-13; RP13
PRPF8 | GDB: 392647 | RETINITIS PIGMENTOSA-13; RP13
PTLAH | GDB: 9957342 | APLASIA OR HYPOPLASIA
RARA | GDB: 120337 | RETINOIC ACID RECEPTOR, ALPHA; RARA
RCV1 | GDB: 135477 | RECOVERIN; RCV1
RMSA1 | GDB: 304519 | REGULATOR OF MITOTIC SPINDLE ASSEMBLY 1; RMSA1
RP17 | GDB: 683199 | RETINITIS PIGMENTOSA-17; RP17
RSS | GDB: 439249 | RUSSELL-SILVER SYNDROME; RSS
SCN4A | GDB: 125181 | PERIODIC PARALYSIS II
SERPINF2 | GDB: 120301 | PLASMIN INHIBITOR DEFICIENCY
SGCA | GDB: 384077 | ADHALIN; ADL
SGSH | GDB: 1319101 | MUCOPOLYSACCHARIDOSIS TYPE IIIA
SHBG | GDB: 125280 | SEX HORMONE BINDING GLOBULIN; SHBG
SLC2A4 | GDB: 119997 | SOLUTE CARRIER FAMILY 2, MEMBER 4; SLC2A4
SLC4A1 | GDB: 119874 | SOLUTE CARRIER FAMILY 4, ANION EXCHANGER, MEMBER 1; SLC4A1 BLOOD GROUP-DIEGO SYSTEM; DI BLOOD GROUP-WRIGHT ANTIGEN; Wr ELLIPTOCYTOSIS, RHESUS-UNLINKED TYPE HEREDITARY HEMOLYTIC
SLC6A4 | GDB: 134713 | SOLUTE CARRIER FAMILY 6, MEMBER 4; SLC6A4
SMCR | GDB: 120379 | SMITH-MAGENIS SYNDROME; SMS
SOST | GDB: 10450629 | SCLEROSTEOSIS
SOX9 | GDB: 134730 | DYSPLASIA
SSTR2 | GDB: 134186 | SOMATOSTATIN RECEPTOR-2; SSTR2
SYM1 | GDB: 512174 | SYMPHALANGISM, PROXIMAL; SYM1
SYNS1 | GDB: 9862343 | SYNOSTOSES, MULTIPLE, WITH BRACHYDACTYLY
TCF2 | GDB: 125298 | TRANSCRIPTION FACTOR-2, HEPATIC; TCF2
THRA | GDB: 120730 | THYROID HORMONE RECEPTOR, ALPHA 1; THRA
TIMP2 | GDB: 132612 | TISSUE INHIBITOR OF METALLOPROTEINASE-2; TIMP2
TOC | GDB: 451978 | TYLOSIS WITH ESOPHAGEAL CANCER; TOC
TOP2A | GDB: 118884 | TOPOISOMERASE (DNA) II, ALPHA; TOP2A
TP53 | GDB: 120445 | CANCER, HEPATOCELLULAR LI-FRAUMENI SYNDROME; LFS TUMOR PROTEIN p53; TP53 CARCINOMA
VBCH | GDB: 9954554 | HYPEROSTOSIS CORTICALIS GENERALISATA

TABLE 19 Genes, Locations and Genetic Disorders on Chromosome 18
Gene | GDB Accession ID | OMIM Link
ATP8B1 | GDB: 453352 | CHOLESTASIS, PROGRESSIVE FAMILIAL INTRAHEPATIC 1; PFIC1 INTRAHEPATIC CHOLESTASIS FAMILIAL INTRAHEPATIC CHOLESTASIS-1; FIC1
BCL2 | GDB: 119031 | B-CELL CLL/LYMPHOMA 2; BCL2
CNSN | GDB: 9954580 | CARNOSINEMIA
CORD1 | GDB: 118773 | CONE-ROD DYSTROPHY-1; CORD1
CYB5 | GDB: 125236 | METHEMOGLOBINEMIA DUE TO DEFICIENCY OF CYTOCHROME b5
DCC | GDB: 119838 | DELETED IN COLORECTAL CARCINOMA; DCC
F5F8D | GDB: 6919858 | FACTOR V AND FACTOR VIII, COMBINED DEFICIENCY OF; F5F8D
FECH | GDB: 127282 | PROTOPORPHYRIA, ERYTHROPOIETIC
FEO | GDB: 4378120 | POLYOSTOTIC OSTEOLYTIC DYSPLASIA, HEREDITARY EXPANSILE; HEPOD
LAMA3 | GDB: 251818 | LAMININ, ALPHA 3; LAMA3
LCFS2 | GDB: 9954578 | CANCER
MADH4 | GDB: 4642788 | POLYPOSIS, JUVENILE INTESTINAL MOTHERS AGAINST DECAPENTAPLEGIC, DROSOPHILA, HOMOLOG OF, 4; MADH4
MAFD1 | GDB: 120163 | MANIC-DEPRESSIVE PSYCHOSIS, AUTOSOMAL
MC2R | GDB: 135163 | ADRENAL UNRESPONSIVENESS TO ACTH
MCL | GDB: 9954574 | LEIOMYOMATA, HEREDITARY MULTIPLE, OF SKIN
MYP2 | GDB: 9862232 | MYOPIA
NPC1 | GDB: 138178 | NIEMANN-PICK DISEASE, TYPE C1; NPC1
SPPK | GDB: 606444 | PALMOPLANTARIS STRIATA
TGFBRE | GDB: 250852 | TRANSFORMING GROWTH FACTOR, BETA 1 RESPONSE ELEMENT
TGIF | GDB: 9787150 | HOLOPROSENCEPHALY, TYPE 4; HPE4
TTR | GDB: 119471 | TRANSTHYRETIN; TTR

TABLE 20 Genes, Locations and Genetic Disorders on Chromosome 19
Gene | GDB Accession ID | OMIM Link
AD2 | GDB: 118748 | ALZHEIMER DISEASE-2; AD2
AMH | GDB: 118996 | PERSISTENT MULLERIAN DUCT SYNDROME, TYPES I AND II; PMDS ANTI-MULLERIAN HORMONE; AMH
APOC2 | GDB: 119689 | APOLIPOPROTEIN C-II DEFICIENCY, TYPE I HYPERLIPOPROTEINEMIA DUE TO
APOE | GDB: 119691 | APOLIPOPROTEIN E; APOE
ATHS | GDB: 128803 | LIPOPROTEIN PHENOTYPE; ALP
BAX | GDB: 228082 | BCL2-ASSOCIATED X PROTEIN; BAX
BCKDHA | GDB: 119723 | MAPLE SYRUP URINE DISEASE
BCL3 | GDB: 120561 | B-CELL LEUKEMIA/LYMPHOMA-3; BCL3
BFIC | GDB: 9954584 | BENIGN FAMILIAL INFANTILE CONVULSIONS
C3 | GDB: 119044 | COMPLEMENT COMPONENT-3; C3
CACNA1A | GDB: 126432 | ATAXIA, PERIODIC VESTIBULOCEREBELLAR HEMIPLEGIC MIGRAINE, FAMILIAL; MHP SPINOCEREBELLAR ATAXIA 6; SCA6 CALCIUM CHANNEL, VOLTAGE-DEPENDENT, P/Q TYPE, ALPHA 1A SUBUNIT; CACNA1A
CCO | GDB: 119755 | CENTRAL CORE DISEASE OF MUSCLE
CEACAM5 | GDB: 119054 | CARCINOEMBRYONIC ANTIGEN; CEA
COMP | GDB: 344263 | EPIPHYSEAL DYSPLASIA, MULTIPLE; MED PSEUDOACHONDROPLASTIC DYSPLASIA CARTILAGE OLIGOMERIC MATRIX PROTEIN; COMP
CRX | GDB: 333932 | CONE-ROD DYSTROPHY-2; CORD2 AMAUROSIS CONGENITA OF LEBER I CONE-ROD HOMEO BOX-CONTAINING GENE
DBA | GDB: 9600353 | ANEMIA, CONGENITAL HYPOPLASTIC, OF BLACKFAN AND DIAMOND
DDU | GDB: 10796026 | URTICARIA; DDU
DFNA4 | GDB: 606540 | DEAFNESS, AUTOSOMAL DOMINANT NONSYNDROMIC SENSORINEURAL, 4; DFNA4
DLL3 | GDB: 9959026 | VERTEBRAL ANOMALIES
DMPK | GDB: 119097 | DYSTROPHIA MYOTONICA; DM
DMWD | GDB: 7178354 | DYSTROPHIA MYOTONICA; DM
DPD1 | GDB: 10796170 | ENGELMANN DISEASE
E11S | GDB: 119101 | ECHO 11 SENSITIVITY; E11S
ELA2 | GDB: 118792 | ELASTASE-2; ELA2 NEUTROPENIA, CYCLIC
EPOR | GDB: 125242 | ERYTHROPOIETIN RECEPTOR; EPOR
ERCC2 | GDB: 119112 | EXCISION-REPAIR, COMPLEMENTING DEFECTIVE, IN CHINESE HAMSTER, 2; ERCC2 XERODERMA PIGMENTOSUM IV; XP4
ETFB | GDB: 119887 | ELECTRON TRANSFER FLAVOPROTEIN, BETA POLYPEPTIDE; ETFB
EXT3 | GDB: 383780 | EXOSTOSES, MULTIPLE, TYPE III; EXT3
EYCL1 | GDB: 119269 | EYE COLOR-1; EYCL1
FTL | GDB: 119234 | FERRITIN LIGHT CHAIN; FTL
FUT1 | GDB: 120618 | FUCOSYLTRANSFERASE-1; FUT1
FUT2 | GDB: 120619 | FUCOSYLTRANSFERASE-2; FUT2
FUT6 | GDB: 135180 | FUCOSYLTRANSFERASE-6; FUT6
GAMT | GDB: 1313736 | GUANIDINOACETATE METHYLTRANSFERASE; GAMT
GCDH | GDB: 136004 | GLUTARICACIDEMIA I
GPI | GDB: 120015 | GLUCOSEPHOSPHATE ISOMERASE; GPI
GUSM | GDB: 119291 | GLUCURONIDASE, MOUSE, MODIFIER OF; GUSM
HB1 | GDB: 9954586 | BUNDLE BRANCH BLOCK
HCL1 | GDB: 119304 | HAIR COLOR-1; HCL1
HHC2 | GDB: 249836 | HYPOCALCIURIC HYPERCALCEMIA, FAMILIAL, TYPE II; HHC2
HHC3 | GDB: 9955121 | HYPOCALCIURIC HYPERCALCEMIA, FAMILIAL, TYPE III; HHC3
ICAM3 | GDB: 136236 | INTERCELLULAR ADHESION MOLECULE-3; ICAM3
INSR | GDB: 119352 | INSULIN RECEPTOR; INSR
JAK3 | GDB: 376460 | JANUS KINASE 3 JAK3
KLK3 | GDB: 119695 | ANTIGEN, PROSTATE-SPECIFIC; APS
LDLR | GDB: 119362 | HYPERCHOLESTEROLEMIA, FAMILIAL; FHC
LHB | GDB: 119364 | LUTEINIZING HORMONE, BETA POLYPEPTIDE; LHB
LIG1 | GDB: 127274 | LIGASE I, DNA, ATP-DEPENDENT; LIG1
LOH19CR1 | GDB: 9837482 | ANEMIA, CONGENITAL HYPOPLASTIC, OF BLACKFAN AND DIAMOND
LYL1 | GDB: 120158 | LEUKEMIA, LYMPHOID, 1; LYL1
MAN2B1 | GDB: 119376 | MANNOSIDOSIS, ALPHA B, LYSOSOMAL
MCOLN1 | GDB: 10013974 | MUCOLIPIDOSIS IV
MDRV | GDB: 6306714 | MUSCULAR DYSTROPHY, AUTOSOMAL DOMINANT, WITH RIMMED VACUOLES; MDRV
MLLT1 | GDB: 136791 | MYELOID/LYMPHOID OR MIXED LINEAGE LEUKEMIA, TRANSLOCATED TO, 1; MLLT1
NOTCH3 | GDB: 361163 | DEMENTIA, HEREDITARY MULTI-INFARCT TYPE NOTCH, DROSOPHILA, HOMOLOG OF, 3; NOTCH3
NPHS1 | GDB: 342105 | NEPHROSIS 1, CONGENITAL, FINNISH TYPE; NPHS1
OFC3 | GDB: 128060 | OROFACIAL CLEFT-3; OFC3
OPA3 | GDB: 9954590 | OPTIC ATROPHY, INFANTILE, WITH CHOREA AND SPASTIC PARAPLEGIA
PEPD | GDB: 120273 | PEPTIDASE D; PEPD
PRPF31 | GDB: 333911 | RETINITIS PIGMENTOSA 11; RP11
PRTN3 | GDB: 126876 | PROTEINASE 3; PRTN3; PR3
PRX | GDB: 11501256 | HYPERTROPHIC NEUROPATHY OF DEJERINE-SOTTAS
PSG1 | GDB: 120321 | PREGNANCY-SPECIFIC BETA-1-GLYCOPROTEIN 1; PSG1
PVR | GDB: 120324 | POLIOVIRUS SUSCEPTIBILITY, OR SENSITIVITY; PVS
RYR1 | GDB: 120359 | CENTRAL CORE DISEASE OF MUSCLE HYPERTHERMIA OF ANESTHESIA RYANODINE RECEPTOR-1; RYR1
SLC5A5 | GDB: 5892184 | SOLUTE CARRIER FAMILY 5, MEMBER 5; SLC5A5
SLC7A9 | GDB: 9958852 | CYSTINURIA, TYPE III; CSNU3
STK11 | GDB: 9732383 | PEUTZ-JEGHERS SYNDROME SERINE/THREONINE PROTEIN KINASE 11; STK11
TBXA2R | GDB: 127517 | THROMBOXANE A2 RECEPTOR, PLATELET; TBXA2R
TGFB1 | GDB: 120729 | ENGELMANN DISEASE TRANSFORMING GROWTH FACTOR, BETA-1; TGFB1
TNNI3 | GDB: 125309 | TROPONIN I, CARDIAC; TNNI3
TYROBP | GDB: 9954457 | POLYCYSTIC LIPOMEMBRANOUS OSTEODYSPLASIA WITH SCLEROSING LEUKOENCEPHALOPATHY

TABLE 21 Genes, Locations and Genetic Disorders on Chromosome 20
Gene | GDB Accession ID | OMIM Link
ADA | GDB: 119649 | ADENOSINE DEAMINASE; ADA
AHCY | GDB: 118983 | S-ADENOSYLHOMOCYSTEINE HYDROLASE; AHCY
AVP | GDB: 119009 | DIABETES INSIPIDUS, NEUROHYPOPHYSEAL TYPE ARGININE VASOPRESSIN; AVP
CDAN2 | GDB: 9823270 | DYSERYTHROPOIETIC ANEMIA, CONGENITAL, TYPE II
CDMP1 | GDB: 438940 | CHONDRODYSPLASIA, GREBE TYPE CARTILAGE-DERIVED MORPHOGENETIC PROTEIN 1
CHED1 | GDB: 3837719 | CORNEAL DYSTROPHY, CONGENITAL ENDOTHELIAL; CHED
CHRNA4 | GDB: 128169 | CHOLINERGIC RECEPTOR, NEURONAL NICOTINIC, ALPHA POLYPEPTIDE 4; CHRNA4 EPILEPSY, BENIGN NEONATAL; EBN1
CST3 | GDB: 119817 | AMYLOIDOSIS VI
EDN3 | GDB: 119862 | ENDOTHELIN-3; EDN3 WAARDENBURG-SHAH SYNDROME
EEGV1 | GDB: 127525 | ELECTROENCEPHALOGRAM, LOW-VOLTAGE
FTLL1 | GDB: 119235 | FERRITIN LIGHT CHAIN; FTL
GNAS | GDB: 120628 | GUANINE NUCLEOTIDE-BINDING PROTEIN, ALPHA-STIMULATING POLYPEPTIDE;
GSS | GDB: 637022 | GLUTATHIONE SYNTHETASE DEFICIENCY OF ERYTHROCYTES, HEMOLYTIC ANEMIA PYROGLUTAMICACIDURIA
HNF4A | GDB: 393281 | DIABETES MELLITUS, AUTOSOMAL DOMINANT TRANSCRIPTION FACTOR 14, HEPATIC NUCLEAR FACTOR; TCF14
JAG1 | GDB: 6175920 | CHOLESTASIS WITH PERIPHERAL PULMONARY STENOSIS JAGGED 1; JAG1
KCNQ2 | GDB: 9787229 | EPILEPSY, BENIGN NEONATAL; EBN1 POTASSIUM CHANNEL, VOLTAGE-GATED, SUBFAMILY Q, MEMBER 2
MKKS | GDB: 9860197 | HYDROMETROCOLPOS SYNDROME
NBIA1 | GDB: 4252819 | HALLERVORDEN-SPATZ DISEASE
PCK1 | GDB: 125349 | PHOSPHOENOLPYRUVATE CARBOXYKINASE 1, SOLUBLE; PCK1
PI3 | GDB: 203940 | PROTEINASE INHIBITOR 3; PI3
PPGB | GDB: 119507 | NEURAMINIDASE DEFICIENCY WITH BETA-GALACTOSIDASE DEFICIENCY
PPMD | GDB: 702144 | CORNEAL DYSTROPHY, HEREDITARY POLYMORPHOUS POSTERIOR; PPCD
PRNP | GDB: 120720 | GERSTMANN-STRAUSSLER DISEASE; GSD PRION PROTEIN; PRNP
THBD | GDB: 119613 | THROMBOMODULIN; THBD
TOP1 | GDB: 120444 | TOPOISOMERASE (DNA) I; TOP1

TABLE 22 Genes, Locations and Genetic Disorders on Chromosome 21
Gene | GDB Accession ID | OMIM Link
AIRE | GDB: 567198 | AUTOIMMUNE POLYENDOCRINOPATHY-CANDIDIASIS-ECTODERMAL DYSTROPHY; APECED
APP | GDB: 119692 | ALZHEIMER DISEASE; AD AMYLOID BETA A4 PRECURSOR PROTEIN; APP
CBS | GDB: 119754 | HOMOCYSTINURIA
COL6A1 | GDB: 119065 | COLLAGEN, TYPE VI, ALPHA-1 CHAIN; COL6A1 MYOPATHY, BENIGN CONGENITAL, WITH CONTRACTURES
COL6A2 | GDB: 119793 | COLLAGEN, TYPE VI, ALPHA-2 CHAIN; COL6A2 MYOPATHY, BENIGN CONGENITAL, WITH CONTRACTURES
CSTB | GDB: 5215249 | MYOCLONUS EPILEPSY OF UNVERRICHT AND LUNDBORG CYSTATIN B; CSTB
DCR | GDB: 125354 | TRISOMY 21
DSCR1 | GDB: 731000 | TRISOMY 21
FPDMM | GDB: 9954610 | CORE-BINDING FACTOR, RUNT DOMAIN, ALPHA SUBUNIT 2; CBFA2 PLATELET DISORDER, FAMILIAL, WITH ASSOCIATED MYELOID MALIGNANCY
HLCS | GDB: 392648 | MULTIPLE CARBOXYLASE DEFICIENCY, BIOTIN-RESPONSIVE; MCD
HPE1 | GDB: 136065 | HOLOPROSENCEPHALY, FAMILIAL ALOBAR
ITGB2 | GDB: 120574 | INTEGRIN BETA-2; ITGB2
KCNE1 | GDB: 127909 | POTASSIUM VOLTAGE-GATED CHANNEL, ISK-RELATED SUBFAMILY, MEMBER 1;
KNO | GDB: 4073044 | KNOBLOCH SYNDROME; KNO
PRSS7 | GDB: 384083 | ENTEROKINASE DEFICIENCY
RUNX1 | GDB: 128313 | CORE-BINDING FACTOR, RUNT DOMAIN, ALPHA SUBUNIT 2; CBFA2 PLATELET DISORDER, FAMILIAL, WITH ASSOCIATED MYELOID MALIGNANCY
SOD1 | GDB: 119596 | AMYOTROPHIC LATERAL SCLEROSIS SUPEROXIDE DISMUTASE-1; SOD1 MUSCULAR ATROPHY, PROGRESSIVE, WITH AMYOTROPHIC LATERAL SCLEROSIS
TAM | GDB: 9958709 | MYELOPROLIFERATIVE SYNDROME, TRANSIENT

TABLE 23 Genes, Locations and Genetic Disorders on Chromosome 22
Gene | GDB Accession ID | OMIM Link
ADSL | GDB: 119655 | ADENYLOSUCCINATE LYASE; ADSL
ARSA | GDB: 119007 | METACHROMATIC LEUKODYSTROPHY, LATE-INFANTILE
BCR | GDB: 120562 | BREAKPOINT CLUSTER REGION; BCR
CECR | GDB: 119772 | CAT EYE SYNDROME; CES
CHEK2 | GDB: 9958730 | LI-FRAUMENI SYNDROME; LFS OSTEOGENIC SARCOMA
COMT | GDB: 119795 | CATECHOL-O-METHYLTRANSFERASE; COMT
CRYBB2 | GDB: 119075 | CRYSTALLIN, BETA B2; CRYBB2 CATARACT, CONGENITAL, CERULEAN TYPE, 2; CCA2
CSF2RB | GDB: 126838 | GRANULOCYTE-MACROPHAGE COLONY-STIMULATING FACTOR RECEPTOR, BETA SUBUNIT;
CTHM | GDB: 439247 | HEART MALFORMATIONS; CTHM
CYP2D6 | GDB: 132127 | CYTOCHROME P450, SUBFAMILY IID; CYP2D
CYP2D@ | GDB: 119832 | CYTOCHROME P450, SUBFAMILY IID; CYP2D
DGCR | GDB: 119843 | DIGEORGE SYNDROME; DGS
DIA1 | GDB: 119848 | METHEMOGLOBINEMIA DUE TO DEFICIENCY OF METHEMOGLOBIN REDUCTASE
EWSR1 | GDB: 135984 | EWING SARCOMA; EWS
GGT1 | GDB: 120623 | GLUTATHIONURIA
MGCR | GDB: 120180 | MENINGIOMA; MGM
MN1 | GDB: 580528 | MENINGIOMA; MGM
NAGA | GDB: 119445 | ALPHA-GALACTOSIDASE B; GALB
NF2 | GDB: 120232 | NEUROFIBROMATOSIS, TYPE II; NF2
OGS2 | GDB: 9954619 | HYPERTELORISM WITH ESOPHAGEAL ABNORMALITY AND HYPOSPADIAS
PDGFB | GDB: 120709 | V-SIS PLATELET-DERIVED GROWTH FACTOR BETA POLYPEPTIDE; PDGFB
PPARA | GDB: 202877 | PEROXISOME PROLIFERATOR ACTIVATED RECEPTOR, ALPHA; PPARA
PRODH | GDB: 5215168 | HYPERPROLINEMIA, TYPE I
SCO2 | GDB: 9958568 | CYTOCHROME c OXIDASE DEFICIENCY
SCZD4 | GDB: 1387047 | SCHIZOPHRENIA DISORDER-4; SCZD4
SERPIND1 | GDB: 120038 | HEPARIN COFACTOR II; HCF2
SLC5A1 | GDB: 120375 | SOLUTE CARRIER FAMILY 5, MEMBER 1; SLC5A1
SOX10 | GDB: 9834028 | SRY-BOX 10; SOX10
TCN2 | GDB: 119608 | TRANSCOBALAMIN II DEFICIENCY
TIMP3 | GDB: 138175 | TISSUE INHIBITOR OF METALLOPROTEINASE-3; TIMP3
VCF | GDB: 136422 | VELOCARDIOFACIAL SYNDROME

TABLE 24 Genes, Locations and Genetic Disorders on Chromosome X
Gene | GDB Accession ID | OMIM Link
ABCD1 | GDB: 118991 | ADRENOLEUKODYSTROPHY; ALD
ACTL1 | GDB: 119648 | ACTIN-LIKE SEQUENCE-1; ACTL1
ADFN | GDB: 118977 | ALBINISM-DEAFNESS SYNDROME; ADFN; ALDS
AGMX2 | GDB: 119661 | AGAMMAGLOBULINEMIA, X-LINKED, TYPE 2; AGMX2; XLA2
AHDS | GDB: 125899 | MENTAL RETARDATION, X-LINKED, WITH HYPOTONIA
AIC | GDB: 118986 | CORPUS CALLOSUM, AGENESIS OF, WITH CHORIORETINAL ABNORMALITY
AIED | GDB: 119663 | ALBINISM, OCULAR, TYPE 2; OA2
AIH3 | GDB: 131443 | AMELOGENESIS IMPERFECTA-3, HYPOPLASTIC TYPE; AIH3
ALAS2 | GDB: 119666 | ANEMIA, HYPOCHROMIC
AMCD | GDB: 5584286 | ARTHROGRYPOSIS MULTIPLEX CONGENITA, DISTAL
AMELX | GDB: 119675 | AMELOGENESIS IMPERFECTA-1, HYPOPLASTIC TYPE; AIH1
ANOP1 | GDB: 128454 | CLINICAL; ANOP1
AR | GDB: 120556 | ANDROGEN INSENSITIVITY SYNDROME; AIS ANDROGEN RECEPTOR; AR
ARAF1 | GDB: 119004 | V-RAF MURINE SARCOMA 3611 VIRAL ONCOGENE HOMOLOG 1; ARAF1
ARSC2 | GDB: 119702 | ARYLSULFATASE C, f FORM; ARSC2
ARSE | GDB: 555743 | CHONDRODYSPLASIA PUNCTATA 1, X-LINKED RECESSIVE; CDPX1
ARTS | GDB: 9954651 | FATAL X-LINKED, WITH DEAFNESS AND LOSS OF VISION
ASAT | GDB: 9954649 | SIDEROBLASTIC, AND SPINOCEREBELLAR ATAXIA; ASAT
ASSP5 | GDB: 119019 | CITRULLINEMIA
ATP7A | GDB: 119395 | ATPase, Cu(2+)-TRANSPORTING, ALPHA POLYPEPTIDE; ATP7A MENKES SYNDROME
ATRX | GDB: 136052 | ALPHA-THALASSEMIA/MENTAL RETARDATION SYNDROME, X-LINKED; ATRX ALPHA-THALASSEMIA/MENTAL RETARDATION SYNDROME, NONDELETION TYPE
AVPR2 | GDB: 131475 | DIABETES INSIPIDUS, NEPHROGENIC
BFLS | GDB: 120566 | BORJESON SYNDROME; BORJ
BGN | GDB: 119727 | BIGLYCAN; BGN
BTK | GDB: 120542 | BRUTON AGAMMAGLOBULINEMIA TYROSINE KINASE; BTK
BZX | GDB: 5205912 | BAZEX SYNDROME; BZX
C1HR | GDB: 119040 | TATA BOX BINDING PROTEIN (TBP)-ASSOCIATED FACTOR 2A; TAF2A
CACNA1F | GDB: 6053864 | NIGHTBLINDNESS, CONGENITAL STATIONARY, X-LINKED, TYPE 2; CSNB2 CALCIUM CHANNEL, VOLTAGE-DEPENDENT, ALPHA 1F SUBUNIT; CACNA1F
CALB3 | GDB: 133780 | CALBINDIN 3; CALB3
CBBM | GDB: 9958963 | COLORBLINDNESS, BLUE-MONO-CONE-MONOCHROMATIC TYPE; CBBM
CCT | GDB: 119756 | CATARACT, CONGENITAL TOTAL, WITH POSTERIOR SUTURAL OPACITIES IN HETEROZYGOTES;
CDR1 | GDB: 119053 | CEREBELLAR DEGENERATION-RELATED AUTOANTIGEN-1; CDR1; CDR34
CFNS | GDB: 9579470 | CRANIOFRONTONASAL SYNDROME; CFNS
CGF1 | GDB: 6275867 | COGNITION
CHM | GDB: 120400 | CHOROIDEREMIA; CHM
CHR39C | GDB: 119779 | CHOLESTEROL REPRESSIBLE PROTEIN 39C; CHR39C
CIDX | GDB: 127736 | SEVERE COMBINED IMMUNODEFICIENCY DISEASE, X-LINKED, 2; SCIDX2
CLA2 | GDB: 119782 | CEREBELLAR ATAXIA, X-LINKED; CLA2
CLCN5 | GDB: 270667 | CHLORIDE CHANNEL 5; CLCN5 FANCONI SYNDROME, RENAL, WITH NEPHROCALCINOSIS AND RENAL STONES NEPHROLITHIASIS, X-LINKED RECESSIVE, WITH RENAL FAILURE; XRN
CLS | GDB: 119784 | RIBOSOMAL PROTEIN S6 KINASE, 90 KD, POLYPEPTIDE 3; RPS6KA3 COFFIN-LOWRY SYNDROME; CLS
CMTX2 | GDB: 128311 | CHARCOT-MARIE-TOOTH NEUROPATHY, X-LINKED RECESSIVE, 2; CMTX2
CMTX3 | GDB: 128151 | CHARCOT-MARIE-TOOTH NEUROPATHY, X-LINKED RECESSIVE, 3; CMTX3
CND | GDB: 9954627 | DERMOIDS OF CORNEA; CND
COD1 | GDB: 119787 | CONE DYSTROPHY, X-LINKED, 1; COD1
COD2 | GDB: 6520166 | CONE DYSTROPHY, X-LINKED, 2; COD2
COL4A5 | GDB: 120596 | COLLAGEN, TYPE IV, ALPHA-5 CHAIN; COL4A5 LEIOMYOMATOSIS, ESOPHAGEAL AND VULVAL, WITH NEPHROPATHY
COL4A6 | GDB: 222775 | COLLAGEN, TYPE IV, ALPHA-6 CHAIN; COL4A6 LEIOMYOMATOSIS, ESOPHAGEAL AND VULVAL, WITH NEPHROPATHY
CPX | GDB: 120598 | CLEFT PALATE, X-LINKED; CPX
CVD1 | GDB: 9954659 | CARDIAC VALVULAR DYSPLASIA, X-LINKED
CYBB | GDB: 120513 | GRANULOMATOUS DISEASE, CHRONIC; CGD
DCX | GDB: 9823272 | LISSENCEPHALY, X-LINKED
DFN2 | GDB: 119091 | DEAFNESS, X-LINKED 2, PERCEPTIVE CONGENITAL; DFN2
DFN4 | GDB: 433255 | DEAFNESS, X-LINKED 4, CONGENITAL SENSORINEURAL; DFN4
DFN6 | GDB: 1320698 | DEAFNESS, X-LINKED, 6, PROGRESSIVE; DFN6
DHOF | GDB: 119847 | FOCAL DERMAL HYPOPLASIA; DHOF
DIAPH2 | GDB: 9835484 | DIAPHANOUS, DROSOPHILA, HOMOLOG OF, 2
DKC1 | GDB: 119096 | DYSKERATOSIS CONGENITA; DKC
DMD | GDB: 119850 | MUSCULAR DYSTROPHY, PSEUDOHYPERTROPHIC PROGRESSIVE, DUCHENNE AND BECKER
DSS | GDB: 433750 | DOSAGE-SENSITIVE SEX REVERSAL; DSS
DYT3 | GDB: 118789 | TORSION DYSTONIA-3, X-LINKED TYPE; DYT3
EBM | GDB: 119102 | BULLOUS DYSTROPHY, HEREDITARY MACULAR TYPE
EBP | GDB: 125212 | CHONDRODYSPLASIA PUNCTATA, X-LINKED DOMINANT; CDPX2; CDPXD; CPXD
ED1 | GDB: 119859 | ECTODERMAL DYSPLASIA, ANHIDROTIC; EDA
ELK1 | GDB: 119867 | ELK1, MEMBER OF ETS ONCOGENE FAMILY; ELK1
EMD | GDB: 119108 | MUSCULAR DYSTROPHY, TARDIVE, DREIFUSS-EMERY TYPE, WITH CONTRACTURES
EVR2 | GDB: 136068 | EXUDATIVE VITREORETINOPATHY, FAMILIAL, X-LINKED RECESSIVE; EVR2
F8C | GDB: 119124 | HEMOPHILIA A
F9 | GDB: 119900 | HEMOPHILIA B; HEMB
FCP1 | GDB: 347490 | F-CELL PRODUCTION, X-LINKED; FCPX
FDPSL5 | GDB: 119922 | SYNTHETASE-5; FPSL5
FGD1 | GDB: 119131 | SYNDROME FACIOGENITAL DYSPLASIA; FGDY
FGS1 | GDB: 9836950 | FG SYNDROME
FMR1 | GDB: 129038 | FRAGILE SITE MENTAL RETARDATION-1; FMR1
FMR2 | GDB: 141566 | FRAGILE SITE, FOLIC ACID TYPE, RARE, FRA(X)(q28); FRAXE
G6PD | GDB: 120621 | GLUCOSE-6-PHOSPHATE DEHYDROGENASE; G6PD
GABRA3 | GDB: 119968 | GAMMA-AMINOBUTYRIC ACID RECEPTOR, ALPHA-3; GABRA3
GATA1 | GDB: 125373 | GATA-BINDING PROTEIN 1; GATA1
GDI1 | GDB: 1347097 | GDP DISSOCIATION INHIBITOR 1; GDI1 MENTAL RETARDATION, X-LINKED NONSPECIFIC, TYPE 3; MRX3
GDXY | GDB: 9954629 | DYSGENESIS, XY FEMALE TYPE; GDXY
GJB1 | GDB: 125246 | CHARCOT-MARIE-TOOTH PERONEAL MUSCULAR ATROPHY, X-LINKED; CMTX1 GAP JUNCTION PROTEIN, BETA-1, 32 KD; GJB1
GK | GDB: 119271 | HYPERGLYCEROLEMIA
GLA | GDB: 119272 | ANGIOKERATOMA, DIFFUSE
GPC3 | GDB: 3770726 | GLYPICAN-3; GPC3 SIMPSON DYSMORPHIA SYNDROME; SDYS
GRPR | GDB: 128035 | GASTRIN-RELEASING PEPTIDE RECEPTOR; GRPR
GTD | GDB: 9954635 | GONADOTROPIN DEFICIENCY; GTD
GUST | GDB: 9954655 | MENTAL RETARDATION WITH OPTIC ATROPHY, DEAFNESS, AND SEIZURES
HMS1 | GDB: 251827 | 1; HMS1
HPRT1 | GDB: 119317 | HYPOXANTHINE GUANINE PHOSPHORIBOSYLTRANSFERASE 1; HPRT1
HPT | GDB: 119322 | HYPOPARATHYROIDISM, X-LINKED; HYPX
HTC2 | GDB: 700980 | HYPERTRICHOSIS, CONGENITAL GENERALIZED; CGH; HCG
HTR2C | GDB: 378202 | 5-@HYDROXYTRYPTAMINE RECEPTOR 2C; HTR2C
HYR | GDB: 9954625 | REGULATOR; HYR
IDS | GDB: 120521 | MUCOPOLYSACCHARIDOSIS TYPE II
IHG1 | GDB: 119343 | HYPOPLASIA OF, WITH GLAUCOMA; IHG
IL2RG | GDB: 134807 | INTERLEUKIN-2 RECEPTOR, GAMMA; IL2RG SEVERE COMBINED IMMUNODEFICIENCY DISEASE, X-LINKED, 2; SCIDX2
INDX | GDB: 9954657 | IMMUNONEUROLOGIC DISORDER, X-LINKED
IP1 | GDB: 120105 | INCONTINENTIA PIGMENTI, TYPE I; IP1
IP2 | GDB: 120106 | INCONTINENTIA PIGMENTI, TYPE II; IP2
JMS | GDB: 204055 | MENTAL RETARDATION, X-LINKED, WITH GROWTH RETARDATION, DEAFNESS, AND
KAL1 | GDB: 120116 | KALLMANN SYNDROME 1; KAL1
KFSD | GDB: 128174 | KERATOSIS FOLLICULARIS SPINULOSA DECALVANS CUM OPHIASI; KFSD
L1CAM | GDB: 120133 | CLASPED THUMB AND MENTAL RETARDATION L1 CELL ADHESION MOLECULE; L1CAM
LAMP2 | GDB: 125376 | LYSOSOME-ASSOCIATED MEMBRANE PROTEIN B; LAMP2; LAMPB
MAA | GDB: 119372 | MICROPHTHALMIA OR ANOPHTHALMOS, WITH ASSOCIATED ANOMALIES; MAA
MAFD2 | GDB: 119373 | PSYCHOSIS, X-LINKED
MAOA | GDB: 120164 | MONOAMINE OXIDASE A; MAOA
MAOB | GDB: 119377 | MONOAMINE OXIDASE B; MAOB
MCF2 | GDB: 120168 | MCF.2 CELL LINE DERIVED TRANSFORMING SEQUENCE; MCF2
MCS | GDB: 128370 | MENTAL RETARDATION, X-LINKED, SYNDROMIC-4, WITH CONGENITAL CONTRACTURES
MEAX | GDB: 119383 | X-LINKED, WITH EXCESSIVE AUTOPHAGY; XMEA; MEAX
MECP2 | GDB: 3851454 | SYNDROME; RTT
MF4 | GDB: 119386 | METACARPAL 4-5 FUSION; MF4
MGC1 | GDB: 120179 | MEGALOCORNEA; MGC1; MGCN
MIC5 | GDB: 120526 | SURFACE ANTIGEN, X-LINKED; SAX
MID1 | GDB: 9772232 | OPITZ SYNDROME
MLLT7 | GDB: 392309 | MYELOID/LYMPHOID OR MIXED-LINEAGE LEUKEMIA, TRANSLOCATED TO, 7; MLLT7
MLS | GDB: 262123 | MICROPHTHALMIA WITH LINEAR SKIN DEFECTS; MLS
MRSD | GDB: 119398 | MENTAL RETARDATION, SKELETAL DYSPLASIA, AND ABDUCENS PALSY; MRSD
MRX14 | GDB: 138453 | RETARDATION, X-LINKED 14; MRX14
MRX1 | GDB: 120193 | MENTAL RETARDATION, X-LINKED NONSPECIFIC, TYPE 1; MRX1
MRX20 | GDB: 217050 | MENTAL RETARDATION, X-LINKED 20; MRX20
MRX2 | GDB: 120194 | RETARDATION, X-LINKED NONSPECIFIC, TYPE 2; MRX2
MRX3 | GDB: 128105 | GDP DISSOCIATION INHIBITOR 1; GDI1 MENTAL RETARDATION, X-LINKED NONSPECIFIC, TYPE 3; MRX3
MRX40 | GDB: 700754 | MENTAL RETARDATION, X-LINKED, WITH HYPOTONIA
MRXA | GDB: 9954641 | MENTAL RETARDATION, X-LINKED NONSPECIFIC, WITH APHASIA; MRXA
MSD | GDB: 119399 | SYNDROME
MTM1 | GDB: 119439 | MYOTUBULAR MYOPATHY 1; MTM1
MYCL2 | GDB: 120209 | MYCL-RELATED PROCESSED GENE; MYCL2
MYP1
GDB: 127783 MYOPIA,X-LINKED; MYP1 NDP GDB: 119449 NORRIE DISEASE; NDP NHS GDB: 120235CATARACT-DENTAL SYNDROME NPHL1 GDB: 433705 NEPHROLITHIASIS, X-LINKEDRECESSIVE, WITH RENAL FAILURE; XRN NR0B1 GDB: 118982 ADRENAL HYPOPLASIA,CONGENITAL; AHC NSX GDB: 125596 SYNDROME; NSX NYS1 GDB: 119458NYSTAGMUS, X-LINKED; NYS NYX GDB: 119814 NIGHTBLINDNESS, CONGENITALSTATIONARY, WITH MYOPIA; CSNB1 OA1 GDB: 119459 ALBINISM, OCULAR, TYPE 1;OA1 OASD GDB: 138457 OCULAR, WITH LATE-ONSET SENSORINEURAL DEAFNESS;OASD OCRL GDB: 119461 LOWE OCULOCEREBRORENAL SYNDROME; OCRL ODT1 GDB:125360 TEETH, ABSENCE OF OFD1 GDB: 120248 OROFACIODIGITAL SYNDROME 1;OFD1 OPA2 GDB: 125358 OPTIC ATROPHY 2; OPA2 OPD1 GDB: 120249OTOPALATODIGITAL SYNDROME OPEM GDB: 119467 OPHTHALMOPLEGIA, EXTERNAL,AND MYOPIA; OPEM OPN1LW GDB: 120724 COLORBLINDNESS, PARTIAL, PROTANSERIES; CBP OPN1MW GDB: 120622 COLORBLINDNESS, PARTIAL, DEUTAN SERIES;CBD; DCB OTC GDB: 119468 ORNITHINE TRANSCARBAMYLASE DEFICIENCY,HYPERAMMONEMIA DUE TO; OTC P3 GDB: 9954667 PROTEIN P3 PDHA1 GDB: 118895PYRUVATE DEHYDROGENASE COMPLEX, E1-ALPHA POLYPEPTIDE-1; PDHA1 PDR GDB:203409 AMYLOIDOSIS, FAMILIAL CUTANEOUS PFC GDB: 120275 PROPERDINDEFICIENCY, X-LINKED PFKFB1 GDB: 125375 6-@PHOSPHOFRUCTO-2-KINASE;PFKFB1 PGK1 GDB: 120282 PHOSPHOGLYCERATE KINASE 1; PGK1 PGK1P1 GDB:120283 PHOSPHOGLYCERATE KINASE 1; PGK1 PGS GDB: 128372 DANDY-WALKERMALFORMATION WITH MENTAL RETARDATION, BASAL GANGLIA DISEASE, PHEX GDB:120520 HYPOPHOSPHATEMIA, VITAMIN D-RESISTANT RICKETS; HYP PHKA1 GDB:120285 PHOSPHORYLASE KINASE, ALPHA 1 SUBUNIT (MUSCLE); PHKA1 PHKA2 GDB:127279 GLYCOGEN STORAGE DISEASE VIII PHP GDB: 119494 PANHYPOPITUITARISM;PHP PIGA GDB: 138138 PHOSPHATIDYLINOSITOL GLYCAN, CLASS A; PIGA PLP1GDB: 120302 PROTEOLIPID PROTEIN, MYELIN; PLP POF1 GDB: 120716 PREMATUREOVARIAN FAILURE 1; POF1 POLA GDB: 120304 POLYMERASE, DNA, ALPHA; POLAPOU3F4 GDB: 351386 DEAFNESS, CONDUCTIVE, WITH STAPES FIXATION PPMX GDB:9954669 RETARDATION WITH PSYCHOSIS, PYRAMIDAL SIGNS, AND 
MACROORCHIDISMPRD GDB: 371323 DYSPLASIA, PRIMARY PRPS1 GDB: 120318PHOSPHORIBOSYLPYROPHOSPHATE SYNTHETASE-I; PRPS1 PRPS2 GDB: 120320PHOSPHORIBOSYLPYROPHOSPHATE SYNTHETASE-II; PRPS2 PRS GDB: 128368 MENTALRETARDATION, X-LINKED, SYNDROMIC-2, WITH DYSMORPHISM AND CEREBRAL PRTSGDB: 128367 PARTINGTON X-LINKED MENTAL RETARDATION SYNDROME; PRTS PSF2GDB: 119519 TRANSPORTER 2, ABC; TAP2 RENBP GDB: 133792 RENIN-BINDINGPROTEIN; RENBP RENS1 GDB: 9806348 MENTAL RETARDATION, X-LINKED,RENPENNING TYPE RP2 GDB: 120353 RETINITIS PIGMENTOSA-2; RP2 RP6 GDB:125381 PIGMENTOSA-6; RP6 RPGR GDB: 118736 RETINITIS PIGMENTOSA-3; RP3RPS4X GDB: 128115 RIBOSOMAL PROTEIN S4, X-LINKED; RPS4X RPS6KA3 GDB:365648 RIBOSOMAL PROTEIN S6 KINASE, 90 KD, POLYPEPTIDE 3; RPS6KA3 RS1GDB: 119581 RETINOSCHISIS; RS S11 GDB: 120361 ANTIGEN, X-LINKED, SECOND;SAX2 SDYS GDB: 119590 GLYPICAN-3; GPC3 SIMPSON DYSMORPHIA SYNDROME; SDYSSEDL GDB: 120372 SPONDYLOEPIPHYSEAL DYSPLASIA, LATE; SEDL SERPINA7 GDB:120399 THYROXINE-BINDING GLOBULIN OF SERUM; TBG SH2D1A GDB: 120701IMMUNODEFICIENCY, X-LINKED PROGRESSIVE COMBINED VARIABLE SHFM2 GDB:226635 SPLIT-HAND/SPLIT-FOOT ANOMALY, X-LINKED SHOX GDB: 6118451 SHORTSTATURE; SS SLC25A5 GDB: 125190 ADENINE NUCLEOTIDE TRANSLOCATOR 2; ANT2SMAX2 GDB: 9954643 SPINAL MUSCULAR ATROPHY, X-LINKED LETHAL INFANTILESRPX GDB: 3811398 RETINITIS PIGMENTOSA-3; RP3 SRS GDB: 136337 MENTALRETARDATION, X-LINKED, SNYDER-ROBINSON TYPE STS GDB: 120393 ICHTHYOSIS,X-LINKED SYN1 GDB: 119606 SYNAPSIN I; SYN1 SYP GDB: 125295SYNAPTOPHYSIN; SYP TAF1 GDB: 120573 TATA BOX BINDING PROTEIN(TBP)-ASSOCIATED FACTOR 2A; TAF2A TAZ GDB: 120609 CARDIOMYOPATHY,DILATED 3A; CMD3A ENDOCARDIAL FIBROELASTOSIS-2; EFE2 TBX22 GDB: 10796448CLEFT PALATE, X-LINKED; CPX TDD GDB: 119610 MALE PSEUDOHERMAPHRODITISM:DEFICIENCY OF TESTICULAR 17, 20-DESMOLASE; TFE3 GDB: 125870TRANSCRIPTION FACTOR FOR IMMUNOGLOBULIN HEAVY-CHAIN ENHANCER-3; TFE3THAS GDB: 128158 THORACOABDOMINAL SYNDROME; TAS THC GDB: 125361THROMBOCYTOPENIA, X-LINKED; THC; XLT 
TIMM8A GDB: 119090 DEAFNESS 1,PROGRESSIVE; DFN1 TIMP1 GDB: 119615 TISSUE INHIBITOR OFMETALLOPROTEINASE-1; TIMP1 TKCR GDB: 119616 TORTICOLLIS, KELOIDS,CRYPTORCHIDISM, AND RENAL DYSPLASIA; TKC TNFSF5 GDB: 120632IMMUNODEFICIENCY WITH INCREASED IgM UBE1 GDB: 118954UBIQUITIN-ACTIVATING ENZYME 1; UBE1 UBE2A GDB: 131647UBIQUITIN-CONJUGATING ENZYME E2A; UBE2A WAS GDB: 120736 WISKOTT-ALDRICHSYNDROME; WAS WSN GDB: 125864 PARKINSONISM, EARLY-ONSET, WITH MENTALRETARDATION WTS GDB: 128373 MENTAL RETARDATION, X-LINKED, SYNDROMIC-6,WITH GYNECOMASTIA AND OBESITY; WWS GDB: 120497 WIEACKER SYNDROME XICGDB: 120498 X-INACTIVATION-SPECIFIC TRANSCRIPT; XIST XIST GDB: 126428X-INACTIVATION-SPECIFIC TRANSCRIPT; XIST XK GDB: 120499 Xk LOCUS XM GDB:119634 XM SYSTEM XS GDB: 119636 LUTHERAN SUPPRESSOR, X-LINKED; XS; LUXSZFX GDB: 120502 ZINC FINGER PROTEIN, X-LINKED; ZFX ZIC3 GDB: 249141HETEROTAXY, X-LINKED VISCERAL; HTX1 ZNF261 GDB: 9785766 MENTALRETARDATION, X-LINKED; DXS6673E ZNF41 GDB: 125865 ZINC FINGERPROTEIN-41; ZNF41 ZNF6 GDB: 120508 ZINC FINGER PROTEIN-6; ZNF6

TABLE 25 Genes, Locations and Genetic Disorders on Chromosome Y
Gene | GDB Accession ID | OMIM Link
AMELY | GDB: 119676 | AMELOGENIN, Y-CHROMOSOMAL; AMELY
ASSP6 | GDB: 119020 | CITRULLINEMIA
AZF1 | GDB: 119027 | AZOOSPERMIA FACTOR 1; AZF1
AZF2 | GDB: 456131 | AZOOSPERMIA FACTOR 2; AZF2
DAZ | GDB: 635890 | DELETED IN AZOOSPERMIA; DAZ
GCY | GDB: 119267 | CONTROL, Y-CHROMOSOME INFLUENCED; GCY
RPS4Y | GDB: 128052 | RIBOSOMAL PROTEIN S4, Y-LINKED; RPS4Y
SMCY | GDB: 5875390 | HISTOCOMPATIBILITY Y ANTIGEN; HY; HYA
SRY | GDB: 125556 | SEX-DETERMINING REGION Y; SRY
ZFY | GDB: 120503 | ZINC FINGER PROTEIN, Y-LINKED; ZFY

TABLE 26 Genes, Locations and Genetic Disorders in Unknown or Multiple Locations
Gene | GDB Accession ID | OMIM Link
ABAT | GDB: 581658 | GAMMA-AMINOBUTYRATE TRANSAMINASE
AEZ | GDB: 128360 | ACRODERMATITIS ENTEROPATHICA, ZINC-DEFICIENCY TYPE; AEZ
AFA | GDB: 265277 | FILIFORME ADNATUM AND CLEFT PALATE
AFD1 | GDB: 265292 | DYSOSTOSIS, TREACHER COLLINS TYPE, WITH LIMB ANOMALIES
AGS1 | GDB: 10795417 | ENCEPHALOPATHY, FAMILIAL INFANTILE, WITH CALCIFICATION OF BASAL GANGLIA
ASAH | GDB: 6837715 | FARBER LIPOGRANULOMATOSIS
ASD1 | GDB: 6276019 | ATRIAL SEPTAL DEFECT; ASD
ASMT | GDB: 136259 | CETYLSEROTONIN METHYLTRANSFERASE; ASMT ACETYLSEROTONIN METHYLTRANSFERASE, Y-CHROMOSOMAL; ASMTY; HIOMTY
BCH | GDB: 118758 | CHOREA, HEREDITARY BENIGN; BCH
CCAT | GDB: 118738 | CATARACT, CONGENITAL OR JUVENILE
CECR9 | GDB: 10796163 | CAT EYE SYNDROME; CES
CEPA | GDB: 581848 | CONTROL, CONGENITAL FAILURE OF
CHED2 | GDB: 9957389 | CORNEAL DYSTROPHY, CONGENITAL HEREDITARY
CLA1 | GDB: 119781 | CEREBELLOPARENCHYMAL DISORDER III
CLA3 | GDB: 128453 | CEREBELLOPARENCHYMAL DISORDER I; CPD I
CLN4 | GDB: 125229 | CEROID-LIPOFUSCINOSIS, NEURONAL 4; CLN4
CPO | GDB: 119070 | COPROPORPHYRIA
CSF2RA | GDB: 118777 | COLONY STIMULATING FACTOR 2 RECEPTOR, ALPHA; CSF2RA GRANULOCYTE-MACROPHAGE COLONY-STIMULATING FACTOR RECEPTOR, ALPHA SUBUNIT,
CTS1 | GDB: 118779 | CARPAL TUNNEL SYNDROME; CTS; CTS1
DF | GDB: 132645 | FACTOR D
DIH1 | GDB: 439243 | DIAPHRAGMATIC
DWS | GDB: 128371 | SYNDROME; DWS
DYT2 | GDB: 118788 | DYSTONIA MUSCULORUM DEFORMANS 2; DYT2
DYT4 | GDB: 433751 | DYSTONIA MUSCULORUM DEFORMANS 4; DYT4
EBR3 | GDB: 118739 | EPIDERMOLYSIS BULLOSA DYSTROPHICA NEUROTROPHICA
ECT | GDB: 128640 | CENTRALOPATHIC EPILEPSY
EEF1A1L14 | GDB: 1327185 | PROSTATIC CARCINOMA ONCOGENE PTI-1
EYCL2 | GDB: 4642815 | EYE COLOR-3; EYCL3
FA1 | GDB: 118795 | FANCONI ANEMIA, COMPLEMENTATION GROUP A; FACA
FANCB | GDB: 9864269 | FANCONI PANCYTOPENIA, TYPE 2
GCSH | GDB: 126842 | HYPERGLYCINEMIA, ISOLATED NONKETOTIC, TYPE III; NKH3
GCSL | GDB: 132139 | ISOLATED NONKETOTIC, TYPE IV; NKH4
GDF5 | GDB: 433948 | CARTILAGE-DERIVED MORPHOGENETIC PROTEIN 1
GIP | GDB: 119985 | GASTRIC INHIBITORY POLYPEPTIDE; GIP
GTS | GDB: 118807 | GILLES DE LA TOURETTE SYNDROME; GTS
HHG | GDB: 118740 | HYPERGONADOTROPIC HYPOGONADISM; HHG
HMI | GDB: 265275 | OF ITO; HMI
HOAC | GDB: 118812 | DEAFNESS, CONGENITAL, AUTOSOMAL RECESSIVE
HOKPP2 | GDB: 595535 | HYPOKALEMIC PERIODIC PARALYSIS, TYPE II; HOKPP2
HRPT1 | GDB: 125252 | HYPERPARATHYROIDISM, FAMILIAL PRIMARY
HSD3B3 | GDB: 676973 | GIANT CELL HEPATITIS, NEONATAL
HTC1 | GDB: 265286 | HYPERTRICHOSIS UNIVERSALIS CONGENITA, AMBRAS TYPE; HTC1
HV1S | GDB: 9955009 | HERPES VIRUS SENSITIVITY; HV1S
ICR1 | GDB: 127785 | LAMELLAR, AUTOSOMAL DOMINANT FORM
ICR5 | GDB: 127789 | ICHTHYOSIS CONGENITA, HARLEQUIN FETUS TYPE
IL3RA | GDB: 128985 | INTERLEUKIN-3 RECEPTOR, ALPHA; IL3RA INTERLEUKIN-3 RECEPTOR, Y-CHROMOSOMAL; IL3RA
KAL2 | GDB: 265288 | KALLMANN SYNDROME 2; KAL2
KMS | GDB: 118827 | SYNDROME; KMS
KRT18 | GDB: 120127 | KERATIN 18; KRT18
KSS | GDB: 9957718 | KEARNS-SAYRE SYNDROME; KSS
LCAT | GDB: 119359 | FISH-EYE DISEASE; FED LECITHIN: CHOLESTEROL ACYLTRANSFERASE DEFICIENCY
LIMM | GDB: 9958161 | MYOPATHY, MITOCHONDRIAL, LETHAL INFANTILE; LIMM
MANBB | GDB: 125262 | MANNOSIDOSIS, BETA; MANB1
MCPH2 | GDB: 9863035 | MICROCEPHALY; MCT
MEB | GDB: 599557 | DISEASE
MELAS | GDB: 9955855 | MELAS SYNDROME
MIC2 | GDB: 120184 | SURFACE ANTIGEN MIC2; MIC2; CD99 MIC2 SURFACE ANTIGEN, Y-CHROMOSOMAL; MIC2Y
MPFD | GDB: 439372 | CONGENITAL, WITH FIBER-TYPE DISPROPORTION
MS | GDB: 229116 | SCLEROSIS; MS
MSS | GDB: 118743 | MARINESCO-SJOGREN SYNDROME; MSS
MTATP6 | GDB: 118897 | ATP SYNTHASE 6; MTATP6
MTCO1 | GDB: 118900 | COMPLEX IV, CYTOCHROME c OXIDASE SUBUNIT I; MTCO1; COI
MTCO3 | GDB: 118902 | CYTOCHROME c OXIDASE III; MTCO3
MTCYB | GDB: 118906 | COMPLEX III, CYTOCHROME b SUBUNIT
MTND1 | GDB: 118911 | COMPLEX I, SUBUNIT ND1; MTND1
MTND2 | GDB: 118912 | COMPLEX I, SUBUNIT ND2; MTND2
MTND4 | GDB: 118914 | COMPLEX I, SUBUNIT ND4; MTND4
MTND5 | GDB: 118916 | COMPLEX I, SUBUNIT ND5; MTND5
MTND6 | GDB: 118917 | COMPLEX I, SUBUNIT ND6; MTND6
MTRNR1 | GDB: 118920 | RIBOSOMAL RNA, MITOCHONDRIAL, 12S; MTRNR1
MTRNR2 | GDB: 118921 | RIBOSOMAL RNA, MITOCHONDRIAL, 16S; MTRNR2
MTTE | GDB: 118926 | TRANSFER RNA, MITOCHONDRIAL, GLUTAMIC ACID; MTTE
MTTG | GDB: 118933 | TRANSFER RNA, MITOCHONDRIAL, GLYCINE; MTTG
MTTI | GDB: 118935 | TRANSFER RNA, MITOCHONDRIAL, ISOLEUCINE; MTTI
MTTK | GDB: 118936 | MERRF SYNDROME TRANSFER RNA, MITOCHONDRIAL, LYSINE; MTTK
MTTL1 | GDB: 118937 | MERRF SYNDROME TRANSFER RNA, MITOCHONDRIAL, LEUCINE, 1; MTTL1
MTTL2 | GDB: 118938 | TRANSFER RNA, MITOCHONDRIAL, LEUCINE, 2; MTTL2
MTTN | GDB: 118940 | TRANSFER RNA, MITOCHONDRIAL, ASPARAGINE; MTTN
MTTP | GDB: 118941 | TRANSFER RNA, MITOCHONDRIAL, PROLINE; MTTP
MTTS1 | GDB: 118944 | TRANSFER RNA, MITOCHONDRIAL, SERINE, 1; MTTS1
NAMSD | GDB: 681237 | NEUROPATHY, MOTOR-SENSORY, TYPE II, WITH DEAFNESS AND MENTAL RETARDATION
NODAL | GDB: 9848762 | NODAL, MOUSE, HOMOLOG OF
OCD1 | GDB: 118846 | DISORDER-1; OCD1
OPD2 | GDB: 131394 | SYNDROME
PCK2 | GDB: 137198 | PHOSPHOENOLPYRUVATE CARBOXYKINASE 2, MITOCHONDRIAL; PCK2
PCLD | GDB: 433949 | POLYCYSTIC LIVER DISEASE; PLD
PCOS1 | GDB: 1391802 | STEIN-LEVENTHAL SYNDROME
PFKM | GDB: 120277 | GLYCOGEN STORAGE DISEASE VII
PKD3 | GDB: 127866 | KIDNEY DISEASE 3, AUTOSOMAL DOMINANT; PKD3
PRCA1 | GDB: 342066 | PROSTATE CANCER; PRCA1
PRO1 | GDB: 128585 |
PROP1 | GDB: 9834318 | PROPHET OF PIT1, MOUSE, HOMOLOG OF; PROP1
RBS | GDB: 118862 | ROBERTS SYNDROME; RBS
RFXAP | GDB: 9475355 | REGULATORY FACTOR X-ASSOCIATED PROTEIN; RFXAP
RP | GDB: 9958158 | RETINITIS PIGMENTOSA-8
SLC25A6 | GDB: 125184 | ADENINE NUCLEOTIDE TRANSLOCATOR 3; ANT3 ADENINE NUCLEOTIDE TRANSLOCATOR 3, Y-CHROMOSOMAL; ANT3Y
SPG5B | GDB: 250333 | SPASTIC PARAPLEGIA-5B, AUTOSOMAL RECESSIVE; SPG5B
STO | GDB: 439375 | CEREBRAL GIGANTISM
SUOX | GDB: 5584405 | SULFOCYSTEINURIA
TC21 | GDB: 5573831 | ONCOGENE TC21
THM | GDB: 439378 | FAMILIAL
TST | GDB: 134043 | RHODANESE; RDS
TTD | GDB: 230276 | TRICHOTHIODYSTROPHY; TTD

Equivalents:

The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described will become apparent to those skilled in the art from the foregoing description and accompanying figures. Such modifications are intended to fall within the scope of the appended claims.

Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties.

The invention can be illustrated by the following embodiments enumerated in the numbered paragraphs that follow:

1. A method for identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, comprising the steps of (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds so that a detectably labeled target RNA: compound complex is formed; (b) separating the detectably labeled target RNA: compound complex formed in step (a) from uncomplexed target RNA molecules and compounds; and (c) determining a structure of the compound bound to the RNA in the RNA: compound complex.

2. A method for identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, comprising the steps of (a) contacting a target RNA molecule with a library of detectably labeled compounds under conditions that permit direct binding of the target RNA to a member of the library of labeled compounds so that a target RNA: compound complex that is detectably labeled is formed; (b) separating the target RNA: compound complex formed in step (a) from uncomplexed target RNA molecules and compounds; and (c) determining a structure of the compound bound to the RNA in the RNA: compound complex.

3. The method of paragraph 1 in which the target RNA molecule contains regions of 28S rRNA or analogs thereof.

4. The method of paragraph 1 in which the detectably labeled RNA is labeled with a fluorescent dye, phosphorescent dye, ultraviolet dye, infrared dye, visible dye, radiolabel, enzyme, spectroscopic colorimetric label, affinity tag, or nanoparticle.

5. The method of paragraph 1 in which the compound is selected from a combinatorial library comprising peptoids; random bio-oligomers; diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous polypeptides; nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates; peptide nucleic acid libraries; antibody libraries; carbohydrate libraries; and small organic molecule libraries, including but not limited to, libraries of benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, morpholino compounds, or diazepindiones.

6. The method of paragraph 1 in which screening a library of compounds comprises contacting the compound with the target nucleic acid in the presence of an aqueous solution, the aqueous solution comprising a buffer and a combination of salts, preferably approximating or mimicking physiologic conditions.

7. The method of paragraph 6 in which the aqueous solution optionally further comprises non-specific nucleic acids comprising DNA, yeast tRNA, salmon sperm DNA, homoribopolymers, and nonspecific RNAs.

8. The method of paragraph 6 in which the aqueous solution further comprises a buffer, a combination of salts, and optionally, a detergent or a surfactant. In another embodiment, the aqueous solution further comprises a combination of salts, from about 0 mM to about 100 mM KCl, from about 0 mM to about 1 M NaCl, and from about 0 mM to about 200 mM MgCl₂. In a preferred embodiment, the combination of salts is about 100 mM KCl, 500 mM NaCl, and 10 mM MgCl₂. In another embodiment, the solution optionally comprises from about 0.01% to about 0.5% (w/v) of a detergent or a surfactant.

9. Any method that detects an altered physical property of a target nucleic acid complexed to a compound, relative to the unbound target nucleic acid, may be used for separation of the complexed and non-complexed target nucleic acids in the method of paragraph 1. In a preferred embodiment, electrophoresis is used for separation of the complexed and non-complexed target nucleic acids. In a preferred embodiment, the electrophoresis is capillary electrophoresis. In other embodiments, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation, proximity assay, structure-activity relationships (“SAR”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, and nanoparticle aggregation are used for the separation of the complexed and non-complexed target nucleic acids.

10. The structure of the compound of the RNA: compound complex of paragraph 1 is determined, in part, by the type of library of compounds. In a preferred embodiment wherein the combinatorial libraries are small organic molecule libraries, mass spectroscopy, NMR, or vibration spectroscopy is used to determine the structure of the compounds.
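
By way of illustration only, and not by way of limitation, the salt and detergent ranges recited in paragraph 8 can be expressed as a simple range check. The following Python sketch is not part of the disclosure: the names `SALT_RANGES_MM`, `DETERGENT_RANGE_PCT` and `check_buffer` are hypothetical, and the sketch assumes that the "about" ranges are treated as hard bounds and that an unlisted salt is present at 0 mM.

```python
# Illustrative sketch only; all names here are hypothetical.
# Ranges are taken from paragraph 8 and treated as hard bounds (an assumption).
SALT_RANGES_MM = {
    "KCl": (0.0, 100.0),
    "NaCl": (0.0, 1000.0),  # 1 M expressed in mM
    "MgCl2": (0.0, 200.0),
}
DETERGENT_RANGE_PCT = (0.01, 0.5)  # % (w/v), applies only if a detergent is used


def check_buffer(salts_mm, detergent_pct=None):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for salt, (low, high) in SALT_RANGES_MM.items():
        conc = salts_mm.get(salt, 0.0)  # a missing salt is assumed to be 0 mM
        if not low <= conc <= high:
            problems.append(f"{salt} at {conc} mM is outside {low}-{high} mM")
    if detergent_pct is not None:
        low, high = DETERGENT_RANGE_PCT
        if not low <= detergent_pct <= high:
            problems.append(f"detergent at {detergent_pct}% is outside {low}-{high}%")
    return problems


# The preferred combination of salts recited in paragraph 8.
preferred = {"KCl": 100.0, "NaCl": 500.0, "MgCl2": 10.0}
print(check_buffer(preferred))                     # no problems reported
print(check_buffer({"KCl": 150.0}, detergent_pct=1.0))  # two problems reported
```

Such a check could be run on a proposed screening buffer before use; it says nothing about whether the buffer approximates physiologic conditions, which paragraph 6 states as a preference rather than a numeric range.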

1. A method for identifying a compound that binds to a target RNA, said method comprising (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA: compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; and (b) detecting the formation of a detectably labeled target RNA: compound complex.
2. A method for identifying a compound to test for its ability to modulate premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA: compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; and (b) detecting a detectably labeled target RNA: compound complex formed in step (a), so that if a target RNA: compound complex is detected then the compound identified is tested for its ability to modulate premature translation termination or nonsense-mediated mRNA decay.
3. A method for identifying a compound that binds to a target RNA, said method comprising detecting the formation of a detectably labeled target RNA: compound complex formed from contacting a detectably labeled RNA with a member of a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA: compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon.
4. A method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA: compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; and (b) detecting a detectably labeled target RNA: compound complex formed in step (a), so that if a target RNA: compound complex is detected, then (c) contacting the compound with a cell-free translation mixture and a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control.
5. A method of identifying a compound that modulates premature translation termination or nonsense-mediated mRNA decay, said method comprising: (a) contacting a detectably labeled target RNA molecule with a library of compounds under conditions that permit direct binding of the labeled target RNA to a member of the library of compounds and the formation of a detectably labeled target RNA: compound complex, wherein the target RNA is a region of 28S rRNA or contains a premature stop codon; and (b) detecting a detectably labeled target RNA: compound complex formed in step (a), so that if a target RNA: compound complex is detected, then (c) contacting the compound with a cell containing a nucleic acid sequence comprising a regulatory element operably linked to a reporter gene, wherein the reporter gene contains a premature stop codon; and (d) detecting the expression of the reporter gene, wherein a compound that modulates premature translation termination or nonsense-mediated mRNA decay is identified if the expression of the reporter gene in the presence of the compound is altered relative to the expression of the reporter gene in the absence of the compound or the presence of a negative control.
6. The method of claim 1, 2, 3, 4 or 5, wherein each compound in the library is attached to a solid support.
7. The method of claim 6, wherein the solid support is a silica gel, a resin, a derivatized plastic film, a glass bead, cotton, a plastic bead, a polystyrene bead, an aluminum gel, a glass slide or a polysaccharide.
8. The method of claim 1, 2, 3, 4 or 5, wherein the library of compounds is attached to a chip.
9. The method of claim 1, 2, 3, 4 or 5, wherein the detectably labeled RNA is labeled with a fluorescent dye, phosphorescent dye, ultraviolet dye, infrared dye, visible dye, radiolabel, enzyme, spectroscopic colorimetric label, affinity tag, or nanoparticle.
10. The method of claim 1, 2, 3, 4 or 5, wherein the compound is a combinatorial library of compounds comprising peptoids; random biooligomers; diversomers such as hydantoins, benzodiazepines and dipeptides; vinylogous polypeptides; nonpeptidal peptidomimetics; oligocarbamates; peptidyl phosphonates; peptide nucleic acid libraries; antibody libraries; carbohydrate libraries; or small organic molecule libraries.
11. The method of claim 10, wherein the small organic molecule libraries are libraries of benzodiazepines, isoprenoids, thiazolidinones, metathiazanones, pyrrolidines, morpholino compounds, or diazepindiones.
12. The method of claim 1, 2, 3, 4 or 5, wherein the detectably labeled target RNA: compound complex is detected by electrophoresis, fluorescence spectroscopy, surface plasmon resonance, mass spectrometry, scintillation, proximity assay, structure-activity relationships (“SAR”) by NMR spectroscopy, size exclusion chromatography, affinity chromatography, or nanoparticle aggregation.
13. The method of claim 1, 2, 3, 4 or 5, wherein the method further comprises determining the structure of the compound.
14. The method of claim 13, wherein the structure of the compound is determined by mass spectroscopy, NMR, X-ray crystallography, Edman degradation or vibration spectroscopy.
15. The method of claim 1, 2, 3, 4 or 5, wherein the premature stop codon is UAG, UGA or UAA.
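
By way of illustration only, the premature stop codons recited above (UAG, UGA or UAA) can be located in a coding sequence with a short in-frame scan. The following Python sketch is not part of the disclosure: the function name `find_premature_stop` is hypothetical, and the sketch assumes that the reading frame begins at the supplied `start` position and that the final complete codon of the sequence is the natural terminator, so that only earlier in-frame stop codons are reported as premature.

```python
# Illustrative sketch only; the function name and its assumptions are hypothetical.
STOP_CODONS = {"UAG", "UGA", "UAA"}


def find_premature_stop(mrna, start=0):
    """Return the 0-based position of the first in-frame stop codon that occurs
    before the final codon of the sequence, or None if there is none.

    Assumes the reading frame begins at `start` and that the last complete
    codon is the natural terminator. DNA-style input (T) is converted to U.
    """
    seq = mrna.upper().replace("T", "U")
    # Split into complete codons only; a trailing partial codon is ignored.
    codons = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
    for n, codon in enumerate(codons[:-1]):  # skip the final (natural) codon
        if codon in STOP_CODONS:
            return start + 3 * n
    return None


print(find_premature_stop("AUGGCCUAGGGAUAA"))  # in-frame UAG before the end
print(find_premature_stop("AUGGCCGGAUAA"))     # only the terminal stop codon
```

A scan of this kind could be used, for example, to confirm that a reporter construct carries the intended in-frame premature stop codon before it is used in the cell-free or cell-based assays described above.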